Methods for cell free protein expression of mature polypeptides derived from zymogens and proproteins

ABSTRACT

High throughput cell free protein synthesis methods and systems are provided for expression of active, mature enzymes and proteins derived from zymogens and proproteins. Zymogens and proproteins are expressed in their mature, active form in a cell free expression system, without cleavage of a pro-sequence to produce the active polypeptide.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to PCT Application No. PCT/US20/21211, filed Mar. 5, 2020, which is hereby incorporated by reference herein in its entirety.

FIELD OF THE INVENTION

The invention relates to methods for cell free protein expression and protein engineering of mature enzymes and proteins derived from zymogens or proproteins, in particular, methods that eliminate the need for post translational modifications, such as cleavage of a pro-sequence.

BACKGROUND

Directed evolution has emerged as an invaluable tool for supplying industry with cost effective enzyme alternatives to precious metal catalysis and petrochemicals. However, there are several gaps in high throughput (HTP) expression limiting the types of enzymes that can be easily engineered using directed evolution. One of these gaps includes expression of proteins that are difficult to express, such as the mature polypeptide forms of zymogens and proproteins. Zymogens are inactive precursors of enzymes that are often highly toxic to a cell and are therefore expressed with leader sequences that must be cleaved to afford active (mature) enzyme. Proproteins are inactive proteins that can be converted to an active (mature) form by post-translational modification, which generally involves cleavage of part of the protein or addition of a molecule. One common feature to both classes of proteins is the presence of a pro-sequence, cleavage of which is essential for its activation from an inactive to an active state. Organisms often employ protein and enzyme precursors when the subsequent, mature protein is potentially harmful.

Currently, it is difficult to express the mature forms of zymogens in HTP model organisms, such as E. coli, which are frequently employed in directed evolution approaches. Mitigating the mature enzyme's toxicity by expressing it as a zymogen complicates HTP enzyme engineering strategies because it requires the addition of exogenous proteases or co-expression of endogenous proteases to afford active enzyme. Non-specific cleavage of the pro-sequence leads to variability in activation of the enzyme and often results in excess proteolysis of the enzyme. A general and reproducible HTP expression system for zymogens and proproteins would be of great interest to industry for rapid and modular expression of mature enzymes and proteins in industrial, pharmaceutical, and consumer applications.

A conventional directed evolution campaign necessarily combines in vitro and in vivo steps. DNA libraries are produced as pools of variants using error-prone polymerase chain reaction (PCR), gene shuffling, molecular breeding, and other random mutagenesis methods known in the art. These steps are generally performed in vitro. These pooled variants must be separated prior to expression. This is accomplished by a transformation step, in vivo, followed by the picking of thousands of genetically distinct colonies. To ensure that the full library is sampled, oversampling by a factor of 2-3 is often necessary (Wu et al., PNAS 2019, 116:18, 8852-8858). In some cases, each colony is immediately induced to express the protein of interest. In other cases, the variant DNA sequence must be isolated from each colony and transformed again into an expression strain, in vivo. Alternatively, the DNA sequence can be used as a template for cell free protein synthesis (CFPS), in vitro. This process is time- and labor-intensive and takes several days to complete. Due to their activity and post-translational activation process, the directed evolution of cytotoxic enzymes such as zymogens has remained an unsolved challenge. A fully in vitro protocol for the directed evolution of proteins and enzymes is needed. Furthermore, a general HTP protocol for evolving potentially toxic or biocidal proteins, such as zymogens and proproteins, is a gap in the art addressed herein.
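
The oversampling factor mentioned above can be motivated by a simple sampling argument. The following is a minimal sketch, assuming every variant in the pooled library is equally likely to be picked and that colonies are picked independently; both are simplifying assumptions that are not taken from the cited reference, and the function name is hypothetical. Under these assumptions, the expected fraction of a library of N variants observed after picking k times N colonies is 1 - (1 - 1/N)^(kN), which approaches 1 - e^(-k) for large N.

```python
def expected_coverage(library_size: int, oversampling: float) -> float:
    """Expected fraction of distinct variants observed at least once.

    Assumes uniform, independent sampling with replacement (an idealization;
    real pooled libraries are usually biased toward some variants).
    """
    if library_size < 1 or oversampling < 0:
        raise ValueError("library_size must be >= 1 and oversampling >= 0")
    picks = oversampling * library_size
    # Probability that one given variant is missed in every pick.
    p_missed = (1.0 - 1.0 / library_size) ** picks
    return 1.0 - p_missed


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"{k}x oversampling: {expected_coverage(1000, k):.1%} expected coverage")
```

Under this idealization, 2-fold and 3-fold oversampling correspond to roughly 86% and 95% expected coverage, respectively; a biased library would need more picks to reach the same coverage.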

Researchers have explored directed evolution of the zymogen transglutaminase (Tgase) with limited success: Novo Nordisk (Zhao, et al., Journal of biomolecular screening 2010, 15, 206-212), Codexis (Nazor et al., WO02019094301A1), and Dophen (Hu et al., Transglutaminase for Protein Drug Modification: Pegylation and Beyond In Biocatalysis for Green Chemistry and Chemical Process Development; John Wiley & Sons, Inc., 2011, 10.1002/9781118028308.ch6, 151-172; Hu et al., WO2015191883). Complications with consistent expression of mutant enzymes have resulted in limited success of improving the activity of the enzyme. Instead, these programs focused on improvements in stability of the enzyme or modification of substrate specificity of the enzyme.

HTP expression of fully functional Tgase remains an unsolved challenge (Rachel, et al., Biomolecules 2013, 3, 870), in part because it is widely believed that the pro-sequence is required for proper folding of Tgase to afford catalytically active Tgase (Yurimoto, et al., Bioscience, Biotechnology, and Biochemistry 2004, 68, 2058-2069). Improved methods for HTP expression of catalytically active Tgase are needed.

BRIEF SUMMARY OF THE INVENTION

Methods are provided herein for cell free protein expression of the mature forms of proproteins and zymogens that eliminate the need for post-translational processing, such as cleavage of a pro-sequence, e.g., by an endogenous or exogenous protease, to afford active, mature protein. Non-limiting examples of an expressed zymogen include a transglutaminase, a laccase, a peroxidase, a transferase, a lysyl oxidase, a tyrosinase, a lipase, a peptidase, or a protease. A non-limiting example of an expressed proprotein includes insulin.

Methods are also provided for high throughput cell free protein engineering and expression of mature forms of zymogens (e.g., Tgase) or proproteins, in which post-translational cleavage of a pro-sequence, such as cleavage by an endogenous or exogenous protease, to afford active, mature enzyme (e.g., Tgase) or protein is not required.

High throughput methods are also provided for installing single or multiple point mutations in a gene encoding a mature enzyme (e.g., Tgase) or protein. In some embodiments, DNA libraries of mature enzyme (e.g., Tgase) or protein variants are expressed using cell free methods.

The disclosed methods represent six strategies for Cell Free Protein Synthesis (CFPS) of active, mature enzyme (e.g., Tgase) or protein, without the need for post-translational cleavage of the pro-sequence.

In some embodiments, an in vitro system is employed, such as a supplemented cell extract, for example, energy buffer and amino acid supplemented cell extract, for expression of the active, mature form of a zymogen or proprotein without post-translational cleavage of the pro-sequence, to enable engineering of the enzyme. A non-limiting example is the expression of active, mature Tgase.

In some embodiments, supplemented cell extracts from E. coli are employed for expression of the active, mature enzyme (e.g., Tgase) or protein. For example, the supplemented cell extracts from E. coli may be purchased from commercially available sources, or may be prepared de novo, e.g., as described by Garamella, et al. (2016) ACS Synthetic Biology 5:344; Caschera, et al. (2018) ACS Synthetic Biology 7:2841; or Kwon, et al. (2015) Sci Rep 5:8663.

In some embodiments, enzymes, such as Tgase or variants thereof, with biocidal or cytotoxic properties, are expressed using CFPS methods as described herein.

In one aspect, a method is provided for expressing a mature, active polypeptide form of a zymogen or a proprotein. The method includes: expressing a DNA template that contains a DNA sequence that encodes a mature polypeptide sequence of a zymogen or proprotein in a cell free expression system that is capable of in vitro transcription from the DNA template and in vitro translation. Expression in the cell free expression system produces the mature enzyme or protein polypeptide, and the method does not include post-translational processing steps or cleavage of a pro-sequence from the zymogen or proprotein to produce the mature polypeptide.

In some embodiments, the cell free expression system includes:

(a) an energy mix that includes one or more of: polysaccharides, rNTPs, tRNA, CoA, NAD, cAMP, folinic acid, spermidine, and 3-PGA; and (b) amino acids.

In some embodiments, the cell free expression system includes a cell free extract derived from eukaryotic or prokaryotic cells. For example, the cell free expression system may include an extract of bacterial cells, such as, but not limited to, E. coli cells.

In some embodiments, the mature polypeptide is expressed as a discrete polypeptide without a pro-sequence.

In some embodiments, the mature polypeptide is expressed in the presence of a polypeptide pro-sequence for the zymogen or proprotein. In one embodiment, a DNA sequence that encodes the pro-sequence and the DNA sequence that encodes the mature polypeptide sequence of the zymogen or proprotein are expressed as discrete polypeptide sequences from the same DNA template. In another embodiment, the DNA sequence that encodes the mature polypeptide is expressed from a first DNA template, and the DNA sequence that encodes the pro-sequence is expressed from a separate second DNA template. In another embodiment, the pro-sequence is synthesized chemically and added to the cell free expression system prior to, during, or after expression of the mature polypeptide.

In some embodiments, the method includes expressing a mature transglutaminase enzyme (EC 2.3.2.13) in an active, mature form. For example, the transglutaminase enzyme may be, but is not limited to, a bacterial transglutaminase, such as the transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof.

In some embodiments, the DNA template that encodes the mature polypeptide sequence of a zymogen or proprotein is contained within an expression vector or a linear DNA fragment. In some embodiments, the DNA template that encodes the mature polypeptide sequence and the pro-sequence polypeptide of a zymogen or proprotein are both contained within a bicistronic expression vector or a linear DNA fragment. In some embodiments, a first DNA template that encodes the mature polypeptide sequence and a second DNA template that encodes the pro-sequence are contained within separate expression vectors or linear DNA fragments.

In some embodiments, an expression vector that encodes the mature polypeptide and/or the pro-sequence is a plasmid derived from pBR322, a pUC vector, or a pET vector.

In some embodiments, a linear DNA fragment that encodes the mature polypeptide and/or the pro-sequence is produced by de novo synthesis or by amplification of a template nucleic acid sequence that encodes the mature polypeptide and/or the pro-sequence. For example, the amplification may be, but is not limited to, polymerase chain reaction (PCR). In some embodiments, production of the linear DNA fragment includes a purification step. Purification may be performed in the presence of a nuclease inhibitor, such as, but not limited to, GamS.

In some embodiments, the cell free expression system further includes a reversible inhibitor of an activity of the mature zymogen or proprotein.

In some embodiments, the mature polypeptide sequence of the zymogen or proprotein is a variant that includes one or more mutation in comparison to the wild type sequence of the zymogen or proprotein. In some embodiments, the method includes: introducing one or more mutation into a DNA sequence that encodes the mature polypeptide sequence of a zymogen or proprotein; and expressing the DNA template in the cell free expression system, thereby producing a variant mature polypeptide with one or more mutation relative to the wild type sequence of the mature zymogen or proprotein.

In another aspect, a mature, active polypeptide of a zymogen or proprotein is provided, which is produced according to any of the cell free expression methods described herein.

In another aspect, a cell free expression system is provided for expression of a mature, active form of a zymogen or a proprotein. The cell free expression system includes: (a) a nucleic acid template that contains a DNA sequence that encodes a mature polypeptide sequence of a zymogen or proprotein; and (b) a cell free reaction mixture that is capable of in vitro transcription from the DNA template and in vitro translation to produce the mature polypeptide. In some embodiments, the cell free reaction mixture includes: (a) an energy mix that includes one or more of: polysaccharides, rNTPs, tRNA, CoA, NAD, cAMP, folinic acid, spermidine, and 3-PGA; and (b) amino acids. In some embodiments, the cell free expression system includes a cell free extract derived from eukaryotic or prokaryotic cells. For example, the cell free expression system may include an extract of bacterial cells, such as, but not limited to, E. coli cells.
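
The components recited for the cell free expression system can be captured in a small data structure. The following is a minimal sketch and not a protocol: the class name, field names, and the `problems` check are hypothetical, and no concentrations are implied because none are recited above.

```python
from dataclasses import dataclass, field

# Energy mix components as listed in this summary.
ENERGY_MIX_COMPONENTS = frozenset({
    "polysaccharides", "rNTPs", "tRNA", "CoA", "NAD",
    "cAMP", "folinic acid", "spermidine", "3-PGA",
})


@dataclass
class CellFreeReaction:
    """Hypothetical description of one cell free expression reaction."""
    dna_templates: list[str]           # labels of template(s), e.g. mature sequence
    energy_mix: set[str]               # subset of ENERGY_MIX_COMPONENTS
    amino_acids: bool = True           # amino acid supplement present
    extract_source: str = "E. coli"    # organism the extract is derived from
    additives: list[str] = field(default_factory=list)  # e.g. synthetic pro-sequence

    def problems(self) -> list[str]:
        """Return a list of missing elements; an empty list means none found."""
        issues = []
        if not self.dna_templates:
            issues.append("no DNA template")
        if not (self.energy_mix & ENERGY_MIX_COMPONENTS):
            issues.append("energy mix contains none of the listed components")
        if not self.amino_acids:
            issues.append("no amino acids")
        return issues


if __name__ == "__main__":
    reaction = CellFreeReaction(["mature polypeptide template"], {"rNTPs", "3-PGA"})
    print(reaction.problems())
```

Optional elements described in the following paragraphs, such as a second template for the pro-sequence or a reversible inhibitor, would be recorded in `dna_templates` or `additives` in this sketch.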

In some embodiments, the DNA template encodes a mature sequence of a transglutaminase enzyme (EC 2.3.2.13), such as, but not limited to, a bacterial transglutaminase, such as the transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof.

In some embodiments, the cell free expression system further includes a DNA template that contains a DNA sequence that encodes a pro-sequence for the zymogen or proprotein. In one embodiment, the DNA sequence that encodes the mature polypeptide and the DNA sequence that encodes the pro-sequence of a zymogen or proprotein are encoded as discrete sequences on the same DNA template. In another embodiment, the DNA sequence that encodes the mature polypeptide is encoded on a first DNA template and the DNA sequence that encodes the pro-sequence is encoded on a separate second DNA template.

In some embodiments, the cell free expression system includes a chemically synthesized polypeptide pro-sequence for the zymogen or proprotein, which is added to the cell free expression system prior to, during, or after production of the mature polypeptide.

In some embodiments, the cell free expression system includes a reversible inhibitor of an activity of the mature enzyme or protein, which is added to the cell free expression system prior to, during, or after production of the mature polypeptide.

In some embodiments, the mature polypeptide sequence of the zymogen or proprotein that is encoded by the DNA template in the cell free expression system is a variant that includes one or more mutation in comparison to the wild type sequence of the zymogen or proprotein.

In another aspect, an expression vector or linear DNA fragment is provided, which includes a DNA sequence that encodes a mature polypeptide for a zymogen or proprotein and a DNA sequence that encodes a pro-sequence for the zymogen or proprotein, encoded as discrete sequences and separated by nucleic acid sequences, such as, but not limited to, a stop codon, a spacer region, a ribosome binding site, and a second start codon.
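
The arrangement of two discrete coding sequences on one template can be illustrated with a short string-assembly sketch. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sequences in the usage example are short placeholders (not SEQ ID NO: 2 or any other sequence of this disclosure), and the order of the two coding sequences is left to the caller because it is not specified above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(name: str, orf: str) -> str:
    """Validate that an open reading frame runs from ATG to a single stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError(f"{name}: length is not a multiple of 3")
    if not orf.startswith("ATG"):
        raise ValueError(f"{name}: missing ATG start codon")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if codons[-1] not in STOP_CODONS:
        raise ValueError(f"{name}: missing terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError(f"{name}: premature in-frame stop codon")
    return orf


def assemble_bicistronic(first_orf: str, second_orf: str,
                         spacer: str, rbs_region: str) -> str:
    """Join two ORFs as: first ORF (with stop) + spacer + RBS region + second ORF.

    The second ORF supplies the second start codon, so the two polypeptides
    are translated as discrete chains rather than as a fusion.
    """
    first = check_orf("first_orf", first_orf)
    second = check_orf("second_orf", second_orf)
    return first + spacer.upper() + rbs_region.upper() + second


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    construct = assemble_bicistronic("ATGGCTTCTTAA", "ATGAAAGGTTGA",
                                     spacer="GCTAGC", rbs_region="AGGAGGACAGCT")
    print(construct)
```

The validation step reflects the design intent stated above: each coding sequence must terminate in its own stop codon and the second must begin with its own start codon, otherwise the two sequences would not be expressed as discrete polypeptides.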

In another aspect, a composition is provided that includes: (a) a first expression vector or linear DNA fragment that includes a DNA sequence that encodes a mature polypeptide for a zymogen or proprotein; and (b) a second expression vector or linear DNA fragment that includes a DNA sequence that encodes the pro-sequence for the zymogen or proprotein.

In another aspect, a method for high-throughput engineering of variants of a zymogen or proprotein is provided. The method includes: (i) introducing one or more mutation into a DNA sequence that encodes a mature polypeptide of a zymogen or proprotein, wherein the one or more mutation results in a change in one or more amino acid in the polypeptide sequence, thereby producing a DNA sequence that encodes a variant of the mature polypeptide; and (ii) expressing the DNA sequence in a cell free expression system as described herein, thereby producing a variant of the mature polypeptide. In some embodiments, the variant DNA sequence is produced by site-directed mutagenesis or de novo DNA synthesis. In some embodiments, the method includes producing a plurality of different DNA sequences that encode different variants of mature polypeptides for a zymogen or proprotein, and expressing each of the different DNA sequences in individual cell free expression systems, thereby producing a plurality of different variant mature polypeptides.
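
The enumeration of variants for such a campaign can be sketched as follows. This is a minimal sketch, assuming single amino acid substitutions at user-chosen positions and the common naming convention of wild-type residue, 1-based position, and new residue; the function name and the four-residue parent sequence in the usage example are hypothetical placeholders. Encoding each variant as DNA (for example, by site-directed mutagenesis or de novo synthesis) and expressing it in its own cell free reaction are not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def single_substitution_variants(parent: str, positions: list[int]) -> dict[str, str]:
    """Map variant names (e.g. 'D2A') to variant sequences.

    `positions` are 1-based positions in the mature polypeptide sequence.
    Each listed position is replaced by each of the 19 other amino acids.
    """
    parent = parent.upper()
    variants: dict[str, str] = {}
    for pos in positions:
        if not 1 <= pos <= len(parent):
            raise ValueError(f"position {pos} is outside the sequence")
        wild_type = parent[pos - 1]
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the unchanged parent sequence
            name = f"{wild_type}{pos}{residue}"
            variants[name] = parent[:pos - 1] + residue + parent[pos:]
    return variants


if __name__ == "__main__":
    library = single_substitution_variants("MDSK", positions=[2, 3])  # placeholder parent
    print(len(library), "variants; D2A ->", library["D2A"])
```

Because each variant is an independent entry, each can be assigned to its own reaction vessel, which matches the description above of expressing different DNA sequences in individual cell free expression systems.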

In some embodiments, the method includes assessing one or more property of one or more variant mature polypeptide, in comparison to the mature form of the zymogen or proprotein from which the variant(s) are derived. For example, the property may include, but is not limited to, enzymatic activity and/or stability.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a bicistronic expression vector for CFPS reactions expressed using E. coli, an example sequence for which is depicted in SEQ ID NO: 2.

FIGS. 2A-2B show the designs of two separate plasmids for pro-sequence and mature Tgase, which may be expressed concurrently in CFPS reactions. FIG. 2A shows the “mature Tgase plasmid,” an example for which is the DNA fragment depicted in SEQ ID NO: 3, cloned onto a pUC19-derived vector. FIG. 2B shows the “pro-sequence plasmid,” an example for which is the DNA fragment depicted in SEQ ID NO: 4, cloned onto a pUC19-derived vector.

FIG. 3. Activity of mature Tgase expressed using cell free methods: (1) Bicistronic Vector (expression of the pro-sequence gene and mature Tgase gene using the method described in Example 1); (2) Dual Vectors (expression of the pro-sequence gene and mature Tgase gene using the dual-plasmid expression method described in Example 2); (3) Mature Tgase only (expression of mature Tgase gene in the absence of a pro-sequence as described in Example 5); (4) Ammonium Sulfate (expression of mature Tgase gene in the presence of ammonium sulfate as described in Example 4); (5) Synthetic Pro-Sequence (expression of mature Tgase gene in the presence of a chemically synthesized pro-sequence peptide as described in Example 3). Activity was determined using the hydroxamate assay, and the results are normalized. The Synthetic Pro-Sequence condition was defined as 100% activity and all other conditions are reported as their percent activity relative to the Synthetic Pro-Sequence condition.

FIG. 4. CFPS (in vitro transcription/translation) expression of mature Tgase in the presence of ammonium sulfate ((NH₄)₂SO₄) as described in Example 5. The concentration of ammonium sulfate was varied in the CFPS reaction. The expression level of active mature Tgase was determined using the hydroxamate assay and normalized. 10 mM ammonium sulfate was defined as 100% activity and all other conditions were defined as their percent activity relative to 10 mM ammonium sulfate.

FIGS. 5A-5C. Residual bacterial or fungal cell viability after exposure to a mature Tgase variant (SEQ ID NO: 6) or wild-type mature Tgase (SEQ ID NO: 7) as described in Example 7. Where no bar is visible, no viable cells were detected. FIG. 5A shows residual cell viability of two strains of the Gram-negative bacterium E. coli after exposure to 0.07 weight percent mature Tgase variant or no mature Tgase. FIG. 5B shows residual cell viability of B. subtilis after exposure to 0.07 weight percent mature Tgase variant or no mature Tgase. FIG. 5C shows residual cell viability of C. albicans after exposure to 0.08 weight percent mature Tgase variant, 0.08 weight percent wild-type mature Tgase, or no Tgase.

DETAILED DESCRIPTION

The subject of this disclosure is the development of an HTP expression system for wild type and mutant variants of the active mature forms of zymogens or proproteins that do not require post-translational cleavage of the pro-sequence. Described herein is a general method for the directed evolution of the mature forms of proteins and enzymes that contain pro-sequences in their wild-type forms. As described herein, the mature forms of zymogens (e.g., Tgase) or proproteins may be expressed using a cell free expression system. Further described herein is a method of producing the mature form of a zymogen (e.g., Tgase) or proprotein that does not require post-translational modification.

Methods are described herein that may be applied to a fully cell free directed evolution campaign. In some embodiments, each variant is created separately using site directed mutagenesis (Carey et al. (2013) Cold Spring Harb Protoc. 2013(8), 738-42, Strain-Damerell et al. (2019) Methods Mol Biol. 2025, 281-296, Q5® Site-Directed Mutagenesis Kit, NEB). In other embodiments, a variant is created using de novo DNA synthesis of mutant fragments which are assembled into the full-length mutant gene by Golden Gate Assembly (Engler et al. (2008) PLoS One 3(11), e3647), Gibson Assembly (Gibson et al. (2009) Nature Methods 6, 343-345), or other molecular cloning methods known in the art to assemble the full mutant gene. The mutant gene product of the above steps is amplified, e.g., by polymerase chain reaction (PCR), and used as the template for CFPS. In some embodiments, the amplified gene is purified prior to use in a CFPS method as described herein. The removal of all biological (in vivo) steps streamlines the process and a library of mutant genes can be constructed in a single day. This is a significant improvement to the current art, which employs in vivo steps, resulting in timelines of over 7 days to create and express libraries of DNA. Furthermore, by avoiding the constraints of cellular metabolism, the CFPS step can be used to evolve enzymes for potentially toxic (e.g., biocidal or cytotoxic) activity.

It is a further object of the invention to use the disclosed methods for the directed evolution of enzymes and proteins derived from zymogens or proproteins for biocidal or therapeutic activity.

Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton, et al., Dictionary of Microbiology and Molecular Biology, second ed., John Wiley and Sons, New York (1994), and Hale & Markham, The Harper Collins Dictionary of Biology, Harper Perennial, NY (1991) provide one of skill with a general dictionary of many of the terms used in this invention. Any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention.

Numeric ranges provided herein are inclusive of the numbers defining the range.

I. Definitions

“A,” “an” and “the” include plural references unless the context clearly dictates otherwise.

The term “about” is used herein to mean plus or minus ten percent (10%) of a value. For example, “about 100” refers to any number between 90 and 110.

The term “amino acid” refers to a molecule containing both an amine group and a carboxyl group that are bound to a carbon, which is designated the alpha-carbon. Suitable amino acids include, without limitation, both the D- and L-isomers of the naturally occurring amino acids, as well as non-naturally occurring amino acids prepared by organic synthesis or other metabolic routes. In some embodiments, a single “amino acid” might have multiple sidechain moieties, as available per an extended aliphatic or aromatic backbone scaffold. Unless the context specifically indicates otherwise, the term amino acid, as used herein, is intended to include amino acid analogs.

“Amino acid supplements” are a combination (mixture) of amino acids, such as all 20 natural amino acids, often at a ratio tailored to the composition of the protein being synthesized. (See, e.g., Caschera, et al. (2014) Biochimie 99:162.)

The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified unless clearly indicated to the contrary. Thus, as a non-limiting example, a reference to “A and/or B,” when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A without B (optionally including elements other than B); in another embodiment, to B without A (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.

The term “base pair” or “bp” as used herein refers to a partnership (i.e., hydrogen bonded pairing) of adenine (A) with thymine (T), or of cytosine (C) with guanine (G) in a double stranded DNA molecule. In some embodiments, a base pair may include A paired with Uracil (U), for example, in a DNA/RNA duplex.

“Cell free protein synthesis” or “CFPS” refers to production of protein in an in vitro, cell free system, using the transcription and protein synthesis machinery of a cell, but without the use of living cells. In some embodiments, a cellular extract is used for CFPS, such as a bacterial cell extract.

In general, a “complement” of a given nucleic acid sequence is a sequence that is fully complementary to and hybridizable to the given sequence. In general, a first sequence that is hybridizable to a second sequence or set of second sequences is specifically or selectively hybridizable to the second sequence or set of second sequences, such that hybridization to the second sequence or set of second sequences is preferred (e.g., thermodynamically more stable under a given set of conditions, such as stringent conditions commonly used in the art) in comparison with hybridization with non-target sequences during a hybridization reaction. Typically, hybridizable sequences share a degree of sequence complementarity over all or a portion of their respective lengths, such as 25%-100% complementarity, including at least about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, and 100% sequence complementarity.

The term “complementary” herein refers to the broad concept of sequence complementarity in duplex regions of a single polynucleotide strand or between two polynucleotide strands between pairs of nucleotides through base-pairing. It is known that an adenine nucleotide is capable of forming specific hydrogen bonds (“base pairing”) with a thymine or uracil nucleotide. Similarly, it is known that a cytosine nucleotide is capable of base pairing with a guanine nucleotide. However, in certain circumstances, hydrogen bonds may also form between other pairs of bases, e.g., between adenine and cytosine, etc. “Essentially complementary” or “substantially complementary” herein refers to sequence complementarity in duplex regions of a single polynucleotide strand or between two polynucleotide strands that is incomplete complementarity but in which stability of the duplex region is retained, for example, wherein the complementarity is less than 100% but is greater than about 90%.
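
The distinction between full and substantial complementarity can be made concrete with a short calculation. The following is a minimal sketch, assuming two strands of equal length that are both written 5' to 3' and compared in antiparallel orientation, and counting only the standard A-T, A-U, and G-C pairs; the function names are hypothetical, and the 90% threshold is taken from the example given above.

```python
# Standard Watson-Crick pairs, including A-U for DNA/RNA duplexes.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percent of positions that pair when strand_b is aligned antiparallel to strand_a."""
    a, b = strand_a.upper(), strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    matches = sum((x, y) in PAIRS for x, y in zip(a, reversed(b)))
    return 100.0 * matches / len(a)


def is_substantially_complementary(strand_a: str, strand_b: str,
                                   threshold: float = 90.0) -> bool:
    """True when complementarity is below 100% but above the threshold."""
    pct = percent_complementarity(strand_a, strand_b)
    return threshold < pct < 100.0


if __name__ == "__main__":
    target = "ATGCATGCATGCATGCATGC"         # placeholder 20-nt sequence
    probe = reverse_complement(target)       # fully complementary
    mismatched = "A" + probe[1:]             # one mismatch out of 20 positions
    print(percent_complementarity(target, probe))
    print(percent_complementarity(target, mismatched),
          is_substantially_complementary(target, mismatched))
```

This sketch ignores gaps, bulges, and non-canonical pairs, so it applies only to ungapped duplex regions of equal length.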

The term “derived from” encompasses the terms “originated from,” “obtained from,” “obtainable from,” “isolated from,” “purified from,” and “created from,” and generally indicates that one specified material finds its origin in another specified material or has features that can be described with reference to another specified material.

The term “duplex” herein refers to a region of complementarity that exists between two polynucleotide sequences. The term “duplex region” refers to the region of sequence complementarity that exists between two oligonucleotides or two portions of a single oligonucleotide.

The term “energy mix” refers to a solution containing one or more cofactor, such as rNTPs, tRNAs, Coenzyme A (CoA), nicotinamide adenine dinucleotide (NAD), folinic acid, spermidine, 3-phosphoglyceric acid (3-PGA), and/or mono-, di-, oligo-, and/or polysaccharides. The energy mix promotes recycling of byproducts of in vitro transcription and translation and regeneration of ATP, improving the overall protein yield of the expression system. (See, e.g., Caschera, et al. (2014) Biochimie 99:162.)

As used herein, the term “expression” refers to the process by which a polypeptide is produced based on the nucleic acid sequence of a gene. The process includes both transcription and translation.

As used herein, an “expression vector” refers to a nucleic acid (e.g., DNA) construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. A number of bacterial expression vectors are available commercially and through the American Type Culture Collection (ATCC), Rockville, Md. An expression vector may refer to a DNA construct containing a DNA coding sequence (e.g., gene sequence) that is operably linked to one or more suitable control sequence(s) capable of effecting expression of the coding sequence in a cell free protein synthesis system as described herein. Such control sequences include a promoter to effect transcription, an optional operator sequence to control such transcription, a sequence encoding suitable mRNA ribosome binding sites, and sequences which control termination of transcription and translation. The vector may be a plasmid, a phage particle, or simply a potential genomic insert. Once transformed into a suitable host, the vector may replicate and function independently of the host genome, or may, in some instances, integrate into the genome itself. The plasmid is the most commonly used form of expression vector. However, the invention is intended to include such other forms of expression vectors that serve equivalent functions, and which are, or become, known in the art.

The terms “first end” and “second end,” when used in reference to a nucleic acid molecule, herein refer to the ends of a linear nucleic acid molecule.

A “gene” refers to a DNA segment that is involved in producing a polypeptide and includes regions preceding and following the coding regions as well as intervening sequences (introns) between individual coding segments (exons).

“Hybridization” and “annealing” refer to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence specific manner. The complex may include two nucleic acid strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of polymerase chain reaction (PCR), ligation reaction, sequencing reaction, or cleavage reaction, e.g., enzymatic cleavage of a polynucleotide by a ribozyme. A first nucleic acid sequence that can be stabilized via hydrogen bonding with the bases of the nucleotide residues of a second sequence is said to be “hybridizable” to the second sequence. In such a case, the second sequence can also be said to be hybridizable to the first sequence. The term “hybridized” refers to a polynucleotide in a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.

The terms “isolated,” “purified,” “separated,” and “recovered” as used herein refer to a material (e.g., a protein, nucleic acid, or cell) that is removed from at least one component with which it is naturally associated, for example, at a concentration of at least 90% by weight, or at least 95% by weight, or at least 98% by weight of the sample in which it is contained. For example, these terms may refer to a material which is substantially or essentially free from components which normally accompany it as found in its native state, such as, for example, an intact biological system. An isolated nucleic acid molecule includes a nucleic acid molecule contained in cells that ordinarily express the nucleic acid molecule, but the nucleic acid molecule is present extrachromosomally or at a chromosomal location that is different from its natural chromosomal location.

A “mature” polypeptide, protein or enzyme refers to the activated form of a zymogen or proprotein following cleavage of its pro-sequence or expressed without its pro-sequence. In some embodiments, the mature enzyme may be produced as a separate polypeptide from the pro-sequence in order to eliminate a post-translational processing (activation) step.

“Microbial transglutaminase” (Tgase, EC 2.3.2.13) is one of the most extensively studied industrial enzymes for protein functionalization and protein cross-linking because of its ability to polymerize or functionalize proteins through the formation of a stable ε-(γ-glutamyl)lysine isopeptide bond without the constraint of a consensus sequence or additional cofactors. The most widely studied Tgase, Streptomyces mobaraensis Tgase, is currently produced on the industrial scale by fermentation of the wild type strain S. mobaraensis as an extracellular protein. The HTP production of this enzyme is complicated by a 46-residue N-terminal pro-sequence that must be cleaved to render Tgase functional.

The term “mutation” herein refers to a change introduced into a parental sequence, including, but not limited to, substitutions, insertions, and deletions (including truncations). The consequences of a mutation include, but are not limited to, the creation of a new character, property, function, phenotype or trait not found in the protein encoded by the parental sequence.

The term “nucleotide” herein refers to a monomeric unit of DNA or RNA consisting of a sugar moiety (pentose), a phosphate, and a nitrogenous heterocyclic base. The base is linked to the sugar moiety via the glycosidic carbon (1′ carbon of the pentose) and that combination of base and sugar is a nucleoside. When the nucleoside contains a phosphate group bonded to the 3′ or 5′ position of the pentose it is referred to as a nucleotide. A sequence of polymeric operatively linked nucleotides is typically referred to herein as a “base sequence,” “nucleotide sequence,” “polynucleotide sequence,” “oligonucleotide sequence”, or nucleic acid or polynucleotide “strand,” and is represented herein by a formula whose left to right orientation is in the conventional direction of 5′-terminus to 3′-terminus, referring to the terminal 5′ phosphate group and the terminal 3′ hydroxyl group at the “5′” and “3′” ends of the polymeric sequence, respectively.

The term “nucleotide analog” herein refers to analogs of nucleoside triphosphates of the common nucleobases adenine, cytosine, guanine, uracil, and thymine, e.g., (S)-Glycerol nucleoside triphosphates (gNTPs) (Horhota et al. (2006) Organic Letters, 8:5345-5347). Also encompassed are nucleoside tetraphosphates, nucleoside pentaphosphates, and nucleoside hexaphosphates.

The term “operably linked” refers to a juxtaposition or arrangement of specified nucleic acid sequence elements that allows them to perform in concert to bring about an effect. For example, a promoter is operably linked to a coding sequence if it controls the transcription of the coding sequence.

As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e., “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.

“Oligopeptide” refers to a peptide that contains a relatively small number of amino-acid residues, for example, about 2 to about 20 amino acids.

“Peptide” refers to a compound consisting of two or more amino acids linked in a chain, the carboxyl group of each acid being joined to the amino group of the next by a bond of the type R—OC—NH—R′, for example, about 2 to about 50 amino acids.

The term “polymerase” herein refers to an enzyme that catalyzes the polymerization of nucleotides (i.e., the polymerase activity). The term polymerase encompasses DNA polymerases, RNA polymerases, and reverse transcriptases. A “DNA polymerase” catalyzes the polymerization of deoxyribonucleotides. An “RNA polymerase” catalyzes the polymerization of ribonucleotides. A “reverse transcriptase” catalyzes the polymerization of deoxyribonucleotides that are complementary to an RNA template.

The terms “polynucleotide,” “nucleic acid,” and “oligonucleotide” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown, may be single- or multi-stranded (e.g., single-stranded, double-stranded, triple-helical, etc.), and may contain deoxyribonucleotides, ribonucleotides, and/or analogs or modified forms of deoxyribonucleotides or ribonucleotides, including modified nucleotides or bases or their analogs. Because the genetic code is degenerate, more than one codon may be used to encode a particular amino acid, and the present invention encompasses polynucleotides which encode a particular amino acid sequence. Any type of modified nucleotide or nucleotide analog may be used, so long as the polynucleotide retains the desired functionality under conditions of use, including modifications that increase nuclease resistance (e.g., deoxy, 2′-O-Me, phosphorothioates, etc.). Labels may also be incorporated for purposes of detection or capture, for example, radioactive or nonradioactive labels or anchors, e.g., biotin. The term polynucleotide also includes peptide nucleic acids (PNA). Polynucleotides may be naturally occurring or non-naturally occurring. Polynucleotides may contain RNA, DNA, or both, and/or modified forms and/or analogs thereof. A sequence of nucleotides may be interrupted by non-nucleotide components. One or more phosphodiester linkages may be replaced by alternative linking groups. 
These alternative linking groups include, but are not limited to, embodiments wherein phosphate is replaced by P(O)S (“thioate”), P(S)S (“dithioate”), P(O)NR₂ (“amidate”), P(O)R, P(O)OR′, CO or CH₂ (“formacetal”), in which each R or R′ is independently H or substituted or unsubstituted alkyl (1-20 C) optionally containing an ether (—O—) linkage, aryl, alkenyl, cycloalkyl, cycloalkenyl or aralkyl. Not all linkages in a polynucleotide need be identical. Polynucleotides may be linear or circular or comprise a combination of linear and circular portions. The following are nonlimiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, intergenic DNA, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), small nucleolar RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, adapters, and primers. A polynucleotide may include modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component, tag, reactive moiety, or binding partner. Polynucleotide sequences, when provided, are listed in the 5′ to 3′ direction, unless stated otherwise.

As used herein, “polypeptide” refers to a composition comprised of amino acids and recognized as a protein by those of skill in the art. The conventional one-letter or three-letter code for amino acid residues is used herein. The terms “polypeptide” and “protein” are used interchangeably herein to refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an amino acid polymer that has been modified naturally or by intervention; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component. Also, included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), as well as other modifications known in the art.

A “promoter” refers to a regulatory sequence that is involved in initiating transcription of a gene by RNA polymerase. A promoter may be an inducible promoter or a constitutive promoter. An “inducible promoter” is a promoter that is active under environmental or developmental regulatory conditions.

A “proprotein” refers to a protein precursor that is cleaved to form an active protein. A mature proprotein refers to the activated form of the proprotein following cleavage of its pro-sequence or in the absence of the pro-sequence. In some embodiments, the mature proprotein may be produced as a separate polypeptide from the pro-sequence in order to eliminate a post-translational processing (activation) step.

A “pro-sequence” refers to a polypeptide sequence within an expressed protein, e.g., a zymogen or proprotein, which is typically cleaved from the protein to produce an active protein, such as an enzyme. In some embodiments, a pro-sequence may be essential for correct folding of the protein. In some embodiments, cleavage of the pro-sequence results in transition of an inactive enzyme to active enzyme.

The term “recombinant” refers to genetic material (i.e., nucleic acids, the polypeptides they encode, and vectors and cells comprising such polynucleotides) that has been modified to alter its sequence or expression characteristics, such as by mutating the coding sequence to produce an altered polypeptide, fusing the coding sequence to that of another gene, placing a gene under the control of a different promoter, expressing a gene in a heterologous organism, expressing a gene at decreased or elevated levels, expressing a gene conditionally or constitutively in a manner different from its natural expression profile, and the like. Generally, recombinant nucleic acids, polypeptides, and cells based thereon have been manipulated by man such that they are not identical to related nucleic acids, polypeptides, and cells found in nature. A recombinant cell may also be referred to as “engineered.”

The phrases “substantially similar” and “substantially identical” in the context of at least two nucleic acids typically mean that a polynucleotide includes a sequence that has at least about 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or even 99.5% sequence identity, in comparison with a reference (e.g., wild-type) polynucleotide or polypeptide. Sequence identity may be determined using known programs such as BLAST, ALIGN, and CLUSTAL using standard parameters. (See, e.g., Altschul et al. (1990) J. Mol. Biol. 215:403-410; Henikoff et al. (1992) Proc. Natl. Acad. Sci. 89:10915; Karlin et al. (1993) Proc. Natl. Acad. Sci. 90:5873; and Higgins et al. (1988) Gene 73:237.) Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. Also, databases may be searched using FASTA (Pearson et al. (1988) Proc. Natl. Acad. Sci. 85:2444-2448). In some embodiments, substantially identical nucleic acid molecules hybridize to each other under stringent conditions (e.g., within a range of medium to high stringency).

Nucleic acid “synthesis” herein refers to any in vitro method for making a new strand of polynucleotide or elongating an existing polynucleotide (i.e., DNA or RNA) in a template dependent manner. Synthesis, according to the invention, can include amplification, which increases the number of copies of a polynucleotide template sequence with the use of a polymerase. Polynucleotide synthesis (e.g., amplification) results in the incorporation of nucleotides into a polynucleotide (e.g., extension from a primer), thereby forming a new polynucleotide molecule complementary to the polynucleotide template. The formed polynucleotide molecule and its template can be used as templates to synthesize additional polynucleotide molecules. “DNA synthesis,” as used herein, includes, but is not limited to, polymerase chain reaction (PCR), and may include the use of labeled nucleotides, e.g., for probes and oligonucleotide primers, or for polynucleotide sequencing.

“Under transcriptional control” is a term well understood in the art that indicates that transcription of a polynucleotide sequence depends on its being operably linked to an element which contributes to the initiation of, or promotes transcription.

Related (and derivative) proteins encompass “variant” proteins. Variant proteins differ from another (i.e., parental or wild-type) protein and/or from one another by a small number of amino acid residues. A variant may include one or more amino acid mutations (e.g., amino acid deletion, insertion or substitution) as compared to the parental protein from which it is derived. In some embodiments, the number of different amino acid residues is any of about 1, 2, 3, 4, 5, 10, 20, 25, 30, 35, 40, 45, or 50. In some embodiments, variants differ by about 1 to about 10 amino acids. Alternatively or additionally, variants may have a specified degree of sequence identity with a reference protein or nucleic acid, e.g., as determined using a sequence alignment tool, such as BLAST, ALIGN, and CLUSTAL (see, supra). For example, variant proteins or nucleic acids may have at least about 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or even 99.5% amino acid sequence identity with a reference sequence (e.g., parental or wild-type sequence from which the variant protein is derived).

As used herein, “wild-type,” “native,” and “naturally-occurring” proteins are those found in nature. The term “wild-type sequence” refers to an amino acid or nucleic acid sequence that is found in nature or naturally occurring. In some embodiments, a wild-type sequence is the starting point of a protein engineering project, for example, production of variant proteins.

A “zymogen” or “proenzyme” refers to an inactive precursor of an enzyme, which may be converted into an active enzyme by catalytic action, such as via proteolytic cleavage of a pro-sequence. A mature form of a zymogen or proenzyme refers to the activated form of a zymogen or proenzyme following cleavage of its pro-sequence or in the absence of the pro-sequence. In some embodiments, the mature form of the zymogen or proenzyme may be produced as a separate polypeptide from the pro-sequence in order to eliminate a post-translational processing (activation) step.

II. Methods

Methods are provided for the expression of DNA encoding a zymogen (e.g., transglutaminase (EC 2.3.2.13)), such as the transglutaminase (Tgase) from Streptomyces mobaraensis (GenBank AF531437.1), as a mature, active polypeptide (e.g., properly folded, active polypeptide), in such a way to alleviate the need for post-translational cleavage of the pro-sequence, for example, via an endogenous or exogenous protease. In some embodiments, a mature polypeptide, expressed as described herein, has biocidal properties (e.g., Tgase). In some embodiments, a mature polypeptide, expressed as described herein, has therapeutic properties (e.g., insulin).

A. CFPS (In Vitro Transcription/Translation) of Mature Polypeptide

In vitro Cell Free Protein Synthesis (CFPS) systems are deployed in the methods described herein, for expression of nucleic acid (e.g., DNA or RNA) encoding a mature enzyme (e.g., Tgase) or protein polypeptide sequence. In some embodiments, the CFPS system is a cell extract, such as a prokaryotic or eukaryotic cell extract.

In some embodiments, the CFPS system is a supplemented cell extract, for example, an energy buffer (energy mix) and amino acid supplemented cell extract.

In some embodiments, the CFPS system is a supplemented bacterial cell extract, such as a supplemented E. coli cell extract, for example, an energy buffer and amino acid supplemented bacterial (e.g., E. coli) cell extract. Supplemented cell extracts from E. coli are commercially available (myTXTL® system, Arbor Biosciences; TNT® System, Promega, Inc.) or may be prepared, for example, according to methods described by Garamella, et al. (2016) ACS Synthetic Biology 5:344; Caschera, et al. (2018) ACS Synthetic Biology 7:2841; or Kwon, et al. (2015) Sci Rep 5:8663.

In some embodiments, the CFPS system is a reconstituted cell-free system, for example, purified macromolecules including ribosomes, RNA polymerase, transcription factors, aminoacyl-tRNA synthetases, energy regulation enzymes, and tRNAs and small molecules including amino acids and rNTPs (Tuckey et al. (2015) Curr Protoc Mol Biol. 108:16.31.1-16.31.2). Reconstituted cell-free systems are commercially available (e.g., PURExpress® In Vitro Protein Synthesis Kit, New England Biolabs, Inc.).

In some embodiments, the nucleic acid template that encodes the mature polypeptide is an RNA molecule.

In some embodiments, the nucleic acid template that encodes the mature polypeptide is a DNA molecule.

The DNA template that encodes the mature polypeptide may be present in the CFPS system at a concentration of about 0.1 nM to about 20 nM, about 0.1 nM to about 0.5 nM, about 0.1 nM to about 5 nM, about 0.5 nM to about 1 nM, about 1 nM to about 5 nM, about 5 nM to about 10 nM, about 10 nM to about 15 nM, or about 15 nM to about 20 nM. In some embodiments, the DNA encoding a mature polypeptide (e.g., Tgase) is expressed in the presence of an energy mixture (see, e.g., Caschera, et al. (2014) Biochimie 99:162) and/or additional amino acids (see, e.g., Caschera, et al. (2015) BioTechniques 58:40) at a DNA concentration of about 0.1 nM to about 20 nM, about 0.1 nM to about 0.5 nM, about 0.1 nM to about 5 nM, about 0.5 nM to about 1 nM, about 1 nM to about 5 nM, about 5 nM to about 10 nM, about 10 nM to about 15 nM, or about 15 nM to about 20 nM. In some embodiments, a pro-sequence is included as a discrete polypeptide during the cell free expression of mature polypeptide (e.g., Tgase). In other embodiments, the pro-sequence is expressed as a discrete polypeptide from the same DNA template as the mature polypeptide (e.g., Tgase) (e.g., expressed from a vector or plasmid that encodes both the sequence of the mature polypeptide and the pro-sequence) or from a different DNA template in the CFPS in which the mature polypeptide is expressed (e.g., expressed from a vector or plasmid that encodes the pro-sequence and that is separate from the vector or plasmid that encodes the mature polypeptide).
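
By way of nonlimiting illustration, the template concentrations recited above may be related to a laboratory DNA stock as sketched below in Python. The sketch converts a double-stranded DNA concentration from ng/μL to nM and computes the stock volume needed to reach a target concentration in a reaction. The average mass of about 650 g/mol per base pair is a commonly used approximation and is an assumption here, and the function names and any example inputs are hypothetical and are not taken from this disclosure.

```python
# Approximate average molar mass of one double-stranded DNA base pair.
# This is a widely used rule of thumb and is an assumption of this sketch.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_ng_per_ul_to_nM(ng_per_ul, length_bp):
    """Convert a dsDNA concentration from ng/uL to nM (nmol/L)."""
    if ng_per_ul < 0 or length_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    # 1 ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; 1e9 nM per M.
    return ng_per_ul * 1e6 / (length_bp * AVG_BP_MASS_G_PER_MOL)


def stock_volume_ul(stock_nM, target_nM, reaction_volume_ul):
    """Volume of stock (uL) giving target_nM in the final reaction volume."""
    if stock_nM <= 0 or target_nM < 0 or reaction_volume_ul <= 0:
        raise ValueError("inputs must be positive (target may be zero)")
    if target_nM > stock_nM:
        raise ValueError("target concentration exceeds stock concentration")
    # Simple dilution relationship: C1 * V1 = C2 * V2.
    return target_nM * reaction_volume_ul / stock_nM
```

For example, a hypothetical 1,000 bp linear template at 650 ng/μL corresponds to about 1,000 nM under this approximation, and 0.6 μL of a 100 nM stock would give about 5 nM in a 12 μL reaction.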

In some embodiments, the DNA encoding mature polypeptide (e.g., Tgase) is expressed from a plasmid derived from pBR322, a pUC vector, or a pET vector. In some embodiments, the plasmid encoding mature polypeptide (e.g., Tgase) is derived from pUC and is included in the CFPS system at a concentration of about 0.1 nM to about 20 nM, about 0.1 nM to about 0.5 nM, about 0.1 nM to about 5 nM, about 0.5 nM to about 1 nM, about 1 nM to about 5 nM, about 5 nM to about 10 nM, about 10 nM to about 15 nM, or about 15 nM to about 20 nM.

In some embodiments, a linear DNA fragment encoding mature polypeptide (e.g., Tgase) and/or a linear DNA fragment encoding the pro-sequence for the polypeptide, is utilized for expression. In some embodiments, a nuclease inhibitor, such as, but not limited to, gamS, is added to stabilize (extend the half-life of) the linear DNA component.

In some embodiments of the CFPS methods, the cell free extract, energy mixture, amino acid supplement, and DNA encoding mature polypeptide (e.g., Tgase) are added sequentially, in any order, to generate the CFPS reaction mixture. In other embodiments, the cell free extract, energy mixture, and/or amino acid supplement are pre-mixed prior to addition of the DNA encoding mature polypeptide. In some embodiments, some of these components are pre-mixed and other components are added sequentially.

In some embodiments of the present invention, the CFPS reaction is incubated for about 4 hours to about 24 hours at about 10° C. to about 40° C. or about 20° C. In some embodiments, the reaction is incubated for 16 hours to 24 hours at 20° C.

In some embodiments, about 5 to about 20 U/mL, or any of at least about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 U/mL mature zymogen enzyme is produced in a CFPS system as described herein, where a unit is defined as the amount of enzyme that will convert 1 μmol of substrate to product per minute at 37° C. For example, in one nonlimiting embodiment, a mature transglutaminase enzyme is produced, and one unit of transglutaminase enzyme may be defined as the amount of transglutaminase that will convert 1 μmol of hydroxylamine to hydroxamate per minute at 37° C.
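
The unit definition above may be applied as in the following nonlimiting Python sketch, which computes volumetric activity in U/mL from the amount of product formed in an assay at 37° C. The function names and the assumption that product formation is linear over the assay time are illustrative only and are not taken from this disclosure.

```python
def activity_u_per_ml(product_umol, assay_minutes, enzyme_sample_ml):
    """Volumetric activity in U/mL.

    One unit (U) is taken as 1 umol of substrate converted to product per
    minute at 37 C. The calculation assumes the rate is constant over the
    assay time (an assumption of this sketch).
    """
    if assay_minutes <= 0 or enzyme_sample_ml <= 0:
        raise ValueError("assay time and sample volume must be positive")
    if product_umol < 0:
        raise ValueError("amount of product cannot be negative")
    units = product_umol / assay_minutes      # umol/min, i.e., U
    return units / enzyme_sample_ml           # U per mL of enzyme sample


def within_recited_range(u_per_ml, low=5.0, high=20.0):
    """True if the activity lies within the about 5 to about 20 U/mL range."""
    return low <= u_per_ml <= high
```

For instance, 1 μmol of hydroxamate formed in 10 minutes by 0.01 mL of a CFPS reaction would correspond to about 10 U/mL.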

B. CFPS Strategies

Six strategies for CFPS methods to produce a mature, active polypeptide form of a zymogen or proprotein as described herein are provided below.

1. Single Bicistronic Vector for Cell Free Expression

A bicistronic vector (e.g., plasmid) may be constructed by synthesizing the gene for a zymogen or proprotein (e.g., Tgase) with the pro-sequence and mature enzyme or protein separated by a stop codon, a spacer region, a ribosome binding site, and a second start codon (e.g., SEQ ID NO: 2). The method disclosed herein may use, for example, the expression vector outlined in FIG. 1 (E. coli), or other expression vectors known in the art. These vectors include a promoter, a ribosome binding site, and a terminator.
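
The order of elements in such a bicistronic insert may be illustrated by the following nonlimiting Python sketch, which concatenates a pro-sequence coding region, a stop codon, a spacer, a ribosome binding site, a second start codon, and a mature coding region. The spacer and the nucleotides between the ribosome binding site and the start codon are hypothetical placeholders, AGGAGG is used as a generic Shine-Dalgarno motif, and none of the sequences shown correspond to SEQ ID NO: 2 or to any sequence of this disclosure. Elements upstream and downstream of the insert (promoter, first ribosome binding site, linker, tag, final stop codon, terminator) are omitted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical placeholder elements, for illustration only.
SPACER = "GCATCG"
RBS = "AGGAGG"        # generic Shine-Dalgarno motif
RBS_GAP = "TATACAT"   # placeholder nucleotides between RBS and start codon


def _check_cds(name, cds):
    """Require an in-frame coding region with no internal stop codon."""
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError(f"{name} length must be a nonzero multiple of 3")
    if set(cds) - set("ACGT"):
        raise ValueError(f"{name} must contain only A, C, G, T")
    for i in range(0, len(cds), 3):
        if cds[i:i + 3] in STOP_CODONS:
            raise ValueError(f"{name} contains an in-frame stop codon")


def build_bicistronic_insert(pro_cds, mature_cds, stop="TAA", start="ATG"):
    """Return pro CDS + stop + spacer + RBS + gap + start + mature CDS.

    pro_cds is expected to include its own start codon; mature_cds is
    expected to lack one, because the second start codon is added here.
    """
    pro_cds, mature_cds = pro_cds.upper(), mature_cds.upper()
    _check_cds("pro_cds", pro_cds)
    _check_cds("mature_cds", mature_cds)
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG, or TGA")
    return pro_cds + stop + SPACER + RBS + RBS_GAP + start + mature_cds
```

Separating the two coding regions in this manner allows the pro-sequence and the mature polypeptide to be translated as discrete polypeptides from a single transcript.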

In one embodiment, the expression vector depicted in FIG. 1 may be utilized. Following the mature polypeptide sequence is a short peptide linker, a purification tag (e.g., hexa-histidine tag), a stop codon, and a terminator. This construct is produced by de novo DNA synthesis, and cloned into an expression vector, such as a commercially available E. coli plasmid, downstream from a strong promoter, e.g., T7 promoter of the T7 bacteriophage or the E. coli sigma 70 consensus sequence (PtacI).

The full bicistronic genetic circuit is expressed in a CFPS reaction mixture, e.g., using a bacterial extract, such as an extract derived from E. coli. Polypeptides may be produced, for example, by creating a cocktail of one or more plasmids with the desired polypeptide sequence, e.g., the plasmid depicted in FIG. 1, and a CFPS reaction mixture as described herein or a cell free master mix from a commercially available source, e.g., PURExpress® In Vitro Protein Synthesis Kit or Sigma 70 Cell Free Master Mix (non-limiting examples include New England Biolabs, Arbor Biosciences), and incubating the reaction mixture at about 10° C. to about 40° C. for about 4 hours to about 24 hours. In some embodiments, an exogenous RNA polymerase (e.g., T7 RNA polymerase) or DNA encoding an exogenous RNA polymerase may be added to the CFPS reaction mixture. This method allows for the production of active, mature polypeptide and does not require cleavage of the leader sequence, such as by an exogenous or endogenous protease.

2. Dual Expression System

The second approach described herein is outlined in FIGS. 2A-2B. The pro-sequence and the mature sequence of a zymogen or proprotein are cloned into separate expression vectors, e.g., plasmids or linear DNA fragments, and both are expressed concurrently in the CFPS system.

In some embodiments of the present invention, the stoichiometry of the two expression vectors, e.g., plasmids or linear DNA fragments, is varied to determine the minimum concentration of pro-sequence needed to synthesize an active, mature enzyme, e.g., Tgase. The ratio of pro-sequence vector to mature enzyme vector may be as low as about 1:75 or as high as about 1:0.2, or any of about 1:60, 1:55, 1:50, 1:45, 1:40, 1:35, 1:30, 1:25, 1:20, 1:15, 1:10, 1:5, 1:1, or 1:0.5, or about 1:75 to about 1:50, about 1:60 to about 1:35, about 1:40 to about 1:10, about 1:25 to about 1:5, about 1:5 to about 1:0.5, or about 1:1 to about 1:0.2.
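
The vector ratios above may be translated into working concentrations as in the following nonlimiting Python sketch. The sketch assumes the ratio is a molar ratio of pro-sequence vector to mature enzyme vector, which is an interpretation made for illustration, and the function name and example values are hypothetical.

```python
# Bounds of the recited range, expressed as pro-sequence : mature vector.
MIN_RATIO = 1.0 / 75.0   # about 1:75
MAX_RATIO = 1.0 / 0.2    # about 1:0.2


def pro_vector_nM(mature_vector_nM, ratio):
    """Pro-sequence vector concentration for a 'pro:mature' ratio string.

    The ratio is treated as molar (an assumption of this sketch),
    e.g. '1:50' means 1 part pro-sequence vector to 50 parts mature vector.
    """
    if mature_vector_nM <= 0:
        raise ValueError("mature vector concentration must be positive")
    pro_part, mature_part = (float(x) for x in ratio.split(":"))
    if pro_part <= 0 or mature_part <= 0:
        raise ValueError("both parts of the ratio must be positive")
    value = pro_part / mature_part
    if not MIN_RATIO <= value <= MAX_RATIO:
        raise ValueError("ratio lies outside about 1:75 to about 1:0.2")
    return mature_vector_nM * value


def titration(mature_vector_nM, ratios):
    """Map each ratio string to the pro-sequence vector concentration in nM."""
    return {r: pro_vector_nM(mature_vector_nM, r) for r in ratios}
```

With a hypothetical mature enzyme vector at 5 nM, a 1:50 ratio corresponds to about 0.1 nM pro-sequence vector, and a 1:0.2 ratio corresponds to about 25 nM.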

3. Cell Free Expression of Mature Polypeptide with Exogenous Addition of Pro-Peptide

The third approach described herein uses the combination of an expression vector that encodes a mature enzyme or protein polypeptide, e.g., as outlined in FIG. 2A, and exogenous pro-peptide (e.g., synthetically produced pro-peptide) in wild-type or mutant form, e.g., expression vector that encodes mature Tgase and added pro-sequence peptide of Tgase, for example, depicted in SEQ ID NO: 5, added to the CFPS reaction mixture. The method includes the production of the mature polypeptide, e.g., Tgase, from an expression vector, such as the plasmid depicted in FIG. 2A or a linear DNA fragment, in combination with addition of exogenous pro-peptide, made synthetically by de novo peptide synthesis or expressed and purified from a cultured microorganism, in a CFPS reaction, allowing for production of active, mature polypeptide, for example, active, mature Tgase. The mature DNA sequence and pro-peptide may be pre-mixed and added together or may be added sequentially. The exogenous pro-peptide may be added at a concentration of about 10 nM to about 10 μM, about 10 nM to about 100 nM, about 100 nM to about 500 nM, about 500 nM to about 1 μM, or about 1 μM to about 10 μM.

4. Cell Free Expression of Mature Polypeptide with Exogenous Addition of a Reversible Inhibitor

The fourth approach described herein uses the combination of an expression vector that encodes a mature enzyme or protein polypeptide, e.g., as outlined in FIG. 2A, and a reversible small molecule inhibitor of the enzyme. The method includes the production of the mature polypeptide, e.g., Tgase, from an expression vector, such as a plasmid as depicted in FIG. 2A or a linear DNA fragment, in combination with a reversible inhibitor of enzyme activity, in a CFPS reaction, allowing for production of active, mature polypeptide, for example, active, mature Tgase. The mature DNA sequence and reversible inhibitor may be pre-mixed and added together or may be added sequentially. Non-limiting examples of reversible inhibitors, e.g., reversible inhibitors of Tgase, include ammonium sulfate and imidazole. In some embodiments, a reversible inhibitor is a molecule that binds to the enzyme non-covalently or that prevents catalytic activity of an enzyme. After expression of the mature polypeptide, the reversible inhibitor may be removed or dissociated from the enzyme to provide active enzyme when enzymatic activity is desired for an end application of use.

5. Cell Free Expression of Mature Polypeptide without Pro-Sequence or Inhibitor

The fifth approach described herein involves expression of a mature enzyme or protein polypeptide in the absence of its pro-sequence or a reversible inhibitor. The method includes the production of the active, mature polypeptide, e.g., Tgase, from an expression vector, such as the plasmid depicted in FIG. 2A, or a linear DNA fragment, with no addition of exogenous pro-sequence peptide, pro-sequence encoding nucleic acid, or reversible inhibitor, in a CFPS reaction mixture as described herein.

6. High Throughput Engineering and Expression of Variants of Mature Polypeptide in Cell Free Expression System

The sixth approach described herein is a high-throughput method of engineering of mature enzyme or protein polypeptides, with improved properties in comparison to the wild type polypeptide, such as improved activity or long term stability, by introducing specific mutations into the nucleic acid sequence that encodes the mature polypeptide and expressing that sequence using any of the CFPS methods and systems described herein. Mutant genes can be generated using site-directed mutagenesis, de novo DNA synthesis, molecular breeding, or any other methods known in the art. These genes can then be expressed in CFPS using any of the previous five methods.
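
As a nonlimiting illustration of how a variant library for such a high-throughput method might be enumerated in silico before gene synthesis, the following Python sketch lists every single amino acid substitution at selected positions of a parental mature polypeptide sequence. The function name, the variant naming scheme, and the short example sequence are hypothetical, and the sketch does not represent the mutagenesis strategy of any particular embodiment.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_substitution_variants(parent, positions):
    """Enumerate single-substitution variants at 1-based positions.

    Returns a dict mapping a name such as 'D2A' (wild-type residue,
    position, new residue) to the full variant sequence.
    """
    parent = parent.upper()
    if set(parent) - set(AMINO_ACIDS):
        raise ValueError("parent must contain only standard amino acids")
    variants = {}
    for pos in positions:
        if not 1 <= pos <= len(parent):
            raise ValueError(f"position {pos} is outside the sequence")
        wild_type = parent[pos - 1]
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the parental residue itself
            name = f"{wild_type}{pos}{residue}"
            variants[name] = parent[:pos - 1] + residue + parent[pos:]
    return variants
```

Each position contributes 19 variants, so two positions of a hypothetical parent yield 38 sequences, which could then be back-translated, synthesized, and expressed using any of the CFPS methods described above.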

III. Compositions

DNA templates, compositions containing DNA templates, and cell free expression system reaction mixtures are provided for use in the cell free expression systems and methods described herein, i.e., CFPS systems for in vitro transcription/translation of mature, active enzymes and proteins derived from zymogen and proprotein polypeptides.

A. Cell Free Expression Systems

Cell free expression systems are provided that include a reaction mixture that contains: (a) a nucleic acid template that includes a DNA sequence that encodes a mature polypeptide sequence of a zymogen or proprotein; and (b) components, such as a cellular extract (for example, a bacterial cell extract, such as an extract of E. coli cells), that support in vitro transcription from the DNA sequence and in vitro translation of the resulting transcript, to produce a mature enzyme or protein derived from the zymogen or proprotein polypeptide. In some embodiments, the reaction mixture also includes an energy mix (e.g., containing polysaccharides, rNTPs, tRNA, Coenzyme A (CoA), cAMP, folinic acid, spermidine, and/or 3-phosphoglyceric acid (3-PGA)). In some embodiments, the reaction mixture also includes amino acids. In some embodiments, the reaction mixture also includes a reversible inhibitor of an activity of the enzyme (such as a reversible inhibitor of enzymatic activity). In one embodiment, the expressed zymogen is active, mature Tgase, and the reversible inhibitor is ammonium sulfate or imidazole. Typically, the reaction mixture does not include a protease. The enzyme or protein, derived from a zymogen or proprotein, is produced in the cell free expression system without cleavage of a pro-sequence and without addition or inclusion of a cleavage agent such as an endogenous or exogenous protease.

In some embodiments, the enzyme or protein is expressed in the cell free expression system in the absence of the native pro-sequence for the zymogen or proprotein. In other embodiments, the enzyme or protein is expressed concurrently with the pro-sequence or an artificially synthesized pro-sequence is included in the reaction mixture. In embodiments in which the enzyme or protein is expressed concurrently with the pro-sequence, a nucleic acid template that includes a DNA or RNA sequence that encodes the pro-sequence is included in the reaction mixture. The pro-sequence may be expressed from the same template as the mature enzyme or protein or from a second, separate template.

In some embodiments of the cell free expression systems described herein, the nucleic acid template that encodes a mature region of a zymogen encodes a transglutaminase enzyme, and a mature, active transglutaminase enzyme is produced in the reaction mixture, without cleavage of a pro-sequence from the mature, active polypeptide sequence, e.g., without addition or inclusion of a protease. The transglutaminase may be a wild-type transglutaminase, such as the transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7), or a variant thereof.

B. Nucleic Acid Templates

Nucleic acid templates are provided that encode a mature region and/or a pro-sequence for the encoded zymogen or proprotein. The nucleic acid template may be in the form of an expression vector or in the form of a linear DNA or RNA fragment. The template typically includes appropriate sequences for expression of the encoded polypeptide, such as a promoter, operator, ribosome binding site, terminators of transcription and translation, etc.

In some embodiments, a bicistronic template includes DNA or RNA sequences and appropriate sequences for expression of both the mature enzyme or protein and the pro-sequence of a zymogen or proprotein as separately expressed polypeptides. The template may include a stop codon, a spacer region, a ribosome binding site, and a second start codon between the coding sequence for the mature enzyme or protein and the coding sequence for the pro-sequence. In some embodiments, separate different templates are provided for expression of the mature enzyme or protein and the pro-sequence.
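The bicistronic layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assemble_bicistronic` and every sequence in the usage lines are short hypothetical placeholders, not the sequences of SEQ ID NO: 2, and the `rbs` argument is assumed to already include any bases between the ribosome binding site and the second start codon.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def assemble_bicistronic(first_cds: str, second_cds: str, spacer: str, rbs: str) -> str:
    """Join two coding sequences so that each is translated as a discrete polypeptide.

    Layout: first CDS (ending in a stop codon) + spacer + ribosome binding
    site + second CDS (beginning with its own start codon).
    """
    first_cds, second_cds = first_cds.lower(), second_cds.lower()
    for name, cds in (("first", first_cds), ("second", second_cds)):
        if len(cds) % 3 != 0:
            raise ValueError(f"{name} CDS length is not a multiple of three")
        if not cds.startswith("atg"):
            raise ValueError(f"{name} CDS does not begin with a start codon")
        if cds[-3:] not in STOP_CODONS:
            raise ValueError(f"{name} CDS does not end with a stop codon")
    return first_cds + spacer.lower() + rbs.lower() + second_cds


# Hypothetical placeholder sequences, for illustration only.
pro_cds = "atggacaattaa"             # Met-Asp-Asn-stop
mature_cds = "atggacagctga"          # Met-Asp-Ser-stop
spacer = "ggttgacagtttccgagccg"      # 20 nt placeholder spacer
rbs = "aaggagatatacc"                # placeholder RBS plus short gap

cassette = assemble_bicistronic(pro_cds, mature_cds, spacer, rbs)
```

The checks on each coding sequence reflect the requirement that the two polypeptides be separately expressed: the first cistron must terminate before the spacer, and the second must carry its own start codon downstream of the ribosome binding site.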

A nucleic acid template may be in the form of an expression vector, such as, but not limited to, a plasmid derived from pBR322, a pUC vector, or a pET vector.

A nucleic acid template may be in the form of a linear DNA fragment, for example, produced via de novo synthesis or via an amplification reaction, such as PCR.

A nucleic acid template may be in the form of a linear RNA fragment, for example, produced via de novo synthesis or in vitro transcription.

C. Polypeptides

Polypeptides are provided that are produced in a cell free expression system (CFPS) as described herein. The polypeptides are active, mature enzymes and proteins derived from a zymogen or proprotein that are produced without activation by cleavage of a pro-sequence, such as without cleavage by a protease.

In some embodiments, the polypeptide is a mature, active enzyme derived from a zymogen, such as a transglutaminase, a laccase, a peroxidase, a transferase, a lysyl oxidase, a tyrosinase, a lipase, or a protease. In a non-limiting example, the polypeptide may be a mature, active transglutaminase enzyme, such as the transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof. In some embodiments, the polypeptide has biocidal or cytotoxic properties. In some embodiments, the polypeptide has therapeutic properties (such as, but not limited to, insulin).

In some embodiments, the expressed polypeptide is a variant of a wild-type or a parental polypeptide sequence from which the expressed polypeptide sequence is derived and has greater activity (e.g., enzymatic, biocidal, cytotoxic, or therapeutic activity) and/or long-term stability than the wild-type or parental polypeptide.

IV. Anti-Microbial Products and Applications of Use

A mature polypeptide form of a zymogen, as described herein, may be used as an alternative or in addition to a conventional preservative, such as, but not limited to, parabens, formaldehyde, and glutaraldehyde, and conventional biocidal agents, including silver (used in wound care products), in various applications that require preservatives, for example, personal care, household, industrial, food, pharmaceutical, cosmetic, healthcare, marine, paint, coating, energy, plastic, packaging, and agricultural products, or in any of the products or systems disclosed herein. In some embodiments, the mature polypeptide is a transglutaminase, such as a bacterial transglutaminase, for example, the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof. In one embodiment, the transglutaminase is the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7). In some embodiments, the mature polypeptide is expressed in a cell free protein expression system as described herein.

A mature enzyme of a zymogen, such as a transglutaminase, a laccase, a peroxidase, a transferase, a lysyl oxidase, a tyrosinase, a lipase, a peptidase, or a protease (including wild type mature polypeptides and variants thereof) may be used as an anti-microbial (e.g., preservative) ingredient that inhibits the growth of potentially harmful bacteria, fungi, and/or other microbes, and accordingly, is added to a product to be preserved in an effective amount to inhibit bacterial, fungal, and/or microbial growth in these products. In some embodiments, the anti-microbial mature enzyme is bacterial transglutaminase, for example, the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof. In one embodiment, the transglutaminase is the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7).

In some embodiments, USP <51> passing criteria are achieved, i.e., for Category 2 Products: for bacteria, no less than 2.0 log reduction from the initial calculated count at 14 days, and no increase from the 14 days' count at 28 days; for yeast and molds, no increase from the initial calculated count at 14 and 28 days. In some embodiments, the antimicrobial behavior of the enzymes and enzyme-biopolymer coformulations is characterized by MIC (minimum inhibitory concentration) against gram-positive and gram-negative bacteria as well as fungi, which results in reduction of microbial growth by approximately 80-100%, or by any of at least about 80%, 85%, 90%, 95%, 98%, or 99%.
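The Category 2 criteria recited above reduce to simple arithmetic on plate counts. The following Python sketch shows one way such a check might be written; the function names and the `tolerance_log` parameter are assumptions introduced here for illustration, and the compendial text should be consulted for how "no increase" is to be interpreted in practice.

```python
import math


def log_reduction(initial_cfu: float, current_cfu: float) -> float:
    """Return the log10 reduction from the initial count to the current count."""
    if initial_cfu <= 0:
        raise ValueError("initial count must be positive")
    if current_cfu <= 0:
        return math.inf  # no recoverable organisms
    return math.log10(initial_cfu / current_cfu)


def no_increase(previous_cfu: float, current_cfu: float, tolerance_log: float = 0.0) -> bool:
    """True if the current count does not exceed the previous count.

    tolerance_log is a hypothetical allowance (in log10 units) for assay
    variability; the default of 0.0 applies the criterion strictly.
    """
    if current_cfu <= previous_cfu:
        return True
    if previous_cfu <= 0:
        return False
    return math.log10(current_cfu / previous_cfu) <= tolerance_log


def bacteria_pass(initial: float, day14: float, day28: float) -> bool:
    """At least 2.0 log reduction at 14 days; no increase from day 14 at day 28."""
    return log_reduction(initial, day14) >= 2.0 and no_increase(day14, day28)


def yeast_mold_pass(initial: float, day14: float, day28: float) -> bool:
    """No increase from the initial count at 14 and 28 days."""
    return no_increase(initial, day14) and no_increase(initial, day28)
```

For example, an inoculum of 10⁶ CFU/mL that falls to 10⁴ CFU/mL at day 14 and stays there at day 28 satisfies the bacterial criterion, whereas a fall to only 5×10⁴ CFU/mL at day 14 does not.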

When combined with a product as described herein, e.g., a personal care, household, industrial, food, pharmaceutical, cosmetic, healthcare, marine, paint, coating, energy, plastic, packaging, or agricultural product, or in any of the products or systems disclosed herein, e.g., in a formulation or incorporated into a product or system as a preservative, the composition may have effective broad spectrum preservation activity over a broad pH range.

In some embodiments, a method is provided that includes adding a preservative composition as described herein (e.g., an expressed mature zymogen polypeptide, such as a crosslinking or lytic enzyme or other enzyme disclosed herein, e.g., a biocidal enzyme, or a composition thereof) to a product or system, such as a personal care, household, industrial, food, pharmaceutical, cosmetic, healthcare, marine, paint, coating, energy, plastic, packaging, or agricultural product, or in any of the products or systems disclosed herein, e.g., in a formulation or incorporated into a product or system, wherein microbial growth is decreased and/or shelf life of the product is increased in comparison to an identical product that does not contain the preservative composition. In some embodiments, the mature enzyme of a zymogen is an enzyme selected from a hydrolase, a protease, a lytic enzyme, and a cross-linking enzyme (e.g., a transglutaminase), for example, expressed in a cell free protein expression system as described herein. In some embodiments, the enzyme is a cross-linking enzyme, such as a transglutaminase, laccase, peroxidase, transferase, lysyl oxidase, or tyrosinase, or a combination thereof. In some embodiments, the anti-microbial mature enzyme is bacterial transglutaminase, for example, the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof. In one embodiment, the transglutaminase is the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7). In some embodiments, no other preservative is included in the product composition, such as, but not limited to formaldehyde and/or glutaraldehyde.

In some embodiments of the methods or compositions described herein, an enzyme (e.g., a zymogen-class enzyme, such as a crosslinking or lytic enzyme or other enzyme disclosed herein, e.g., a biocidal enzyme, for example, a transglutaminase, such as the wild type transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof) may be included at a concentration of about 0.01% w/v to about 5% w/v, or any of at least about 0.01% w/v, 0.05% w/v, 0.1% w/v, 0.5% w/v, 1% w/v, 1.5% w/v, 2% w/v, 2.5% w/v, 3% w/v, 3.5% w/v, 4% w/v, 4.5% w/v, or 5% w/v, or any of about 0.01% w/v to about 0.05% w/v, about 0.1% w/v to about 0.5% w/v, about 1% w/v to about 1.5% w/v, about 1.5% w/v to about 2% w/v, about 2% w/v to about 2.5% w/v, about 2.5% w/v to about 3% w/v, about 3% w/v to about 3.5% w/v, about 3.5% w/v to about 4% w/v, about 4% w/v to about 4.5% w/v, about 4.5% w/v to about 5% w/v, about 0.01% w/v to about 0.1% w/v, about 0.1% w/v to about 1% w/v, about 1% w/v to about 5% w/v, about 0.05% w/v to about 0.5% w/v, about 0.5% w/v to about 5% w/v, about 1% w/v to about 2.5% w/v, or about 2.5% w/v to about 5% w/v.

Examples of personal care products which may incorporate the disclosed compositions that include a mature enzyme of a zymogen include bar soap, liquid soap (e.g., hand soap), hand sanitizer (including rinse off and leave-on alcohol based and aqueous-based hand disinfectants), preoperative skin disinfectant, cleansing wipes, disinfecting wipes, body wash, acne treatment products, antifungal diaper rash cream, antifungal skin cream, shampoo, conditioner, cosmetics (including but not limited to liquid or powder foundation, liquid or solid eyeliner, mascara, cream eye shadow, tinted powder, “pancake” type powder to be used dry or moistened, make up removal products, etc.), deodorant, antimicrobial creams, body lotion, hand cream, topical cream, aftershave lotion, skin toner, mouth wash, toothpaste, sunscreen lotion, and baby products such as, but not limited to, cleansing wipes, baby shampoo, baby soap, and diaper cream. The present subject matter may also be applied to wound care items, such as, but not limited to, wound healing ointments, creams, and lotions, wound coverings, burn wound cream, bandages, tape, and steri-strips, and medical articles such as medical gowns, caps, face masks, and shoe-covers, surgical drops, etc. Additional products include but are not limited to oral products such as mouth rinse, toothpaste, and dental floss coatings, veterinary and pet care products, preservative compositions, and surface disinfectants including solutions, sprays or wipes.

Non-limiting examples of household/industrial products which may incorporate the disclosed compositions that include a mature enzyme of a zymogen include household cleaners such as concentrated liquid cleaners and spray cleaners, cleaning wipes, dish washing liquid, dish washer detergent, spray-mop liquid, furniture polish, indoor paint, outdoor paint, dusting spray, laundry detergent, fabric softener, rug/fabric cleaner, window and glass cleaner, toilet bowl cleaner, liquid/cream cleanser, etc. In a particular embodiment, the compositions of the present subject matter may be used in a food wash product, designed to clean fruits and vegetables prior to consumption, packaging, and food coatings.

The following examples are intended to illustrate, but not limit, the invention.

EXAMPLES

Activity Assays

Tgase activity was measured in the examples herein using a colorimetric hydroxamate activity assay (Folk and Cole (1965) J Biol Chemistry 240(7):2951-2960). Briefly, the hydroxamate assay uses N-carbobenzoxy-L-glutaminylglycine (Z-Gln-Gly or CBZ-Gln-Gly) as the amine acceptor substrate and hydroxylamine as an amine donor. In the presence of transglutaminase, the hydroxylamine is incorporated to form Z-glutamylhydroxamate-glycine, which develops a colored complex with iron (III), detectable at 525 nm after incubation at 37° C. for 5-60 minutes. The calibration was performed using L-glutamic acid γ-monohydroxamate (Millipore Sigma) as standard. One unit of Tgase is defined as the amount of enzyme that catalyzes formation of 1 μmol of the peptide derivative of γ-glutamylhydroxylamine per minute.
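The unit definition above implies a short calculation from absorbance to activity. The Python sketch below assumes a linear standard curve relating absorbance at 525 nm to μmol of L-glutamic acid γ-monohydroxamate; the function names and all numerical values in the usage lines are hypothetical and are not data from the examples herein.

```python
from typing import Sequence, Tuple


def fit_standard_curve(umol: Sequence[float], a525: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through the standards: A525 = slope * umol + intercept."""
    n = len(umol)
    if n < 2 or n != len(a525):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(umol) / n
    mean_y = sum(a525) / n
    sxx = sum((x - mean_x) ** 2 for x in umol)
    if sxx == 0:
        raise ValueError("standards must span more than one amount")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(umol, a525))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tgase_units_per_ml(sample_a525: float, slope: float, intercept: float,
                       minutes: float, sample_ml: float) -> float:
    """Units per mL, where one unit forms 1 umol of hydroxamate product per minute."""
    umol_product = (sample_a525 - intercept) / slope
    return umol_product / minutes / sample_ml


# Hypothetical standards and sample reading, for illustration only.
slope, intercept = fit_standard_curve([0.0, 0.5, 1.0, 2.0], [0.05, 0.30, 0.55, 1.05])
activity = tgase_units_per_ml(0.55, slope, intercept, minutes=10, sample_ml=0.02)
```

Dividing by the volume of cell free reaction mixture assayed, rather than by the total assay volume, gives activity per mL of reaction mixture, which is the basis used for reporting in the examples.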

Example 1. CFPS (In Vitro Transcription/Translation) of Mature Tgase Using Bicistronic Plasmid for Expression of Active, Mature Polypeptide that does not Require Proteolytic Cleavage

The genes coding for the pro-sequence and mature Tgase were codon optimized for expression in E. coli based on the published amino acid sequence (Kanaji, et al. (1993) J. Biol. Chem. 268(16):11565-11572) and synthesized on a single DNA fragment (SEQ ID NO: 2), separated by a linker region containing a stop codon, a 20-60 bp spacer, and a ribosome binding site, and with a peptide spacer and hexa-histidine tag following the mature Tgase. The DNA was sequence verified. The bicistronic DNA fragment was cloned into a pUC19-derived vector downstream of a T7 promoter and a strong ribosome binding site using the XbaI and XhoI cloning sites. The resulting plasmid was transformed into E. coli using the standard heat shock method. The presence of the bicistronic plasmid in transformed, ampicillin resistant colonies was verified by colony PCR and sequencing. The bicistronic plasmid was recovered using standard protocols (QIAprep® Plasmid Miniprep Kit, QIAGEN, Inc., ZymoPURE™ Plasmid Miniprep Kit, Zymo Research Inc., Monarch® Plasmid Miniprep Kit, New England Biolabs, Inc.) and DNA concentration was adjusted appropriately.

CFPS Method

The bicistronic plasmid (1.5 nM) was expressed in the presence of pTXTL-P70a-T7rnap HP (Arbor Biosciences #502134) (0.05 nM) using supplemented cell extracts from E. coli, containing energy buffer and a cocktail of amino acids, following the manufacturer's protocol (myTXTL® system, Arbor Biosciences). The reaction was incubated overnight at 20° C.

The activity of the cell free expressed mature Tgase was determined using the hydroxamate assay and is reported as normalized activity per mL of cell free reaction mixture (FIG. 3).

Example 2. CFPS (In Vitro Transcription/Translation) of Mature Tgase Using Two Plasmids for Concurrent Expression of Active, Mature Polypeptide and Pro-Sequence

The gene encoding the mature Tgase sequence (SEQ ID NO: 3) was codon optimized for E. coli and synthesized, and the DNA was sequence verified. The fragment was cloned downstream of a T7 promoter and a strong ribosome binding site using the NcoI and XhoI cloning sites. The vector was derived from pUC19 and had a ColE1 origin of replication and ampicillin resistance marker. The resulting plasmid was transformed into E. coli using the standard heat shock method. The presence of the mature Tgase encoding plasmid in transformed colonies was verified by colony PCR and sequencing. The “mature Tgase plasmid” (FIG. 2A), containing the gene sequence depicted in SEQ ID NO: 3, was recovered using standard protocols (QIAprep® Plasmid Miniprep Kit, QIAGEN, Inc., ZymoPURE™ Plasmid Miniprep Kit, Zymo Research Inc., Monarch® Plasmid Miniprep Kit, New England Biolabs, Inc.) and DNA concentration was adjusted appropriately.

The gene coding for the pro-sequence of Tgase (SEQ ID NO: 4) was codon optimized for E. coli and synthesized, and the DNA was sequence verified. The fragment was cloned downstream of a T7 promoter and a strong ribosome binding site using the NcoI and XhoI cloning sites. The vector was derived from pUC19 and had a ColE1 origin of replication and ampicillin resistance marker. The resulting plasmid was transformed into E. coli using the standard heat shock method. The presence of the pro-sequence encoding plasmid in transformed colonies was verified by colony PCR and sequencing. The “pro-sequence plasmid” (FIG. 2B), containing the gene sequence depicted in SEQ ID NO:4, was recovered using standard protocols (QIAprep® Plasmid Miniprep Kit, QIAGEN, Inc., ZymoPURE™ Plasmid Miniprep Kit, Zymo Research Inc., Monarch® Plasmid Miniprep Kit, New England Biolabs, Inc.) and DNA concentration was adjusted appropriately.

The “mature Tgase plasmid,” at a concentration of 1.5 nM, and the “pro-sequence plasmid,” at a concentration of 0.02 nM-7.5 nM, were expressed using the CFPS method described in Example 1. This dual vector approach allowed the ratios of the “pro-sequence encoding plasmid” and “mature Tgase plasmid” to be adjusted to afford high expression levels of active Tgase. The plasmids were added at a ratio of 1:0.2 to 1:75 “pro-sequence plasmid”: “mature Tgase plasmid.”
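The stated ratio range follows directly from the plasmid concentrations: with the mature Tgase plasmid fixed at 1.5 nM, a pro-sequence plasmid concentration of 7.5 nM corresponds to 1:0.2 and a concentration of 0.02 nM corresponds to 1:75. A minimal Python sketch of this arithmetic, using a helper name (`pro_to_mature_ratio`) introduced here only for illustration:

```python
def pro_to_mature_ratio(pro_nM: float, mature_nM: float) -> str:
    """Express the plasmid ratio as 1 part pro-sequence plasmid : x parts mature Tgase plasmid."""
    if pro_nM <= 0 or mature_nM <= 0:
        raise ValueError("concentrations must be positive")
    # ':g' formatting rounds to six significant figures, which hides
    # floating-point noise such as 74.99999999999999.
    return f"1:{mature_nM / pro_nM:g}"


MATURE_NM = 1.5  # mature Tgase plasmid concentration from this example

high_pro = pro_to_mature_ratio(7.5, MATURE_NM)   # upper end of the pro-sequence plasmid range
low_pro = pro_to_mature_ratio(0.02, MATURE_NM)   # lower end of the pro-sequence plasmid range
```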

The activity of the cell free expressed mature Tgase was determined using the hydroxamate assay and ranged from 4.4 to 21.6 units of enzyme activity per mL of cell free reaction mixture. Normalized activity per mL of cell free reaction mixture is reported in FIG. 3.

Example 3. CFPS (In Vitro Transcription/Translation) of Mature Tgase with Added Synthetic Pro-Sequence Peptide

The “mature Tgase plasmid” (FIG. 2A) was expressed using the CFPS method described in Example 1, with the pro-sequence provided by 30 nM synthetically derived pro-peptide. Wild-type (SEQ ID NO:5) and mutant pro-sequences were chemically synthesized using conventional peptide synthesis methods. The activity of the cell free expressed mature Tgase was determined using the hydroxamate assay and is reported in normalized activity per mL of cell free reaction mixture (FIG. 3).

Example 4. CFPS (In Vitro Transcription/Translation) of Mature Tgase with Added Small Molecule Reversible Inhibitor

The “mature Tgase plasmid” (FIG. 2A) was expressed using the CFPS method described in Example 1. However, no pro-sequence (expressed, synthetic, etc.) was added. Instead, ammonium sulfate was added at a concentration of 0-50 mM. The activity of the cell free expressed mature Tgase was determined using the hydroxamate assay. Normalized activity is reported in FIG. 4. The activity of mature Tgase expressed under optimized conditions is reported in normalized activity per mL of cell free reaction mixture (FIG. 3).

Example 5. CFPS (In Vitro Transcription/Translation) of Mature Tgase without Pro-Sequence

The “mature Tgase plasmid” (FIG. 2A) was expressed using the CFPS method described in Example 1, in the absence of a source of pro-sequence or small molecule inhibitor. The activity of the cell free expressed mature Tgase was determined using the hydroxamate assay and is reported in normalized activity per mL of cell free reaction mixture (FIG. 3).

Example 6. CFPS (In Vitro Transcription/Translation) of Single Point Mutants of Mature Tgase

The gene coding for the mature Tgase sequence (SEQ ID NO: 3) was cloned as described in Example 2 and subjected to site-directed mutagenesis using well established techniques known in the art to include a serine-to-proline substitution at the second residue of the mature Tgase gene (Q5® Site-Directed Mutagenesis Kit, New England Biolabs, Inc.).
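At the protein level, the substitution described above converts the N-terminal residues DSDD of the wild-type mature Tgase (SEQ ID NO: 7) to DPDD, as in the variant of SEQ ID NO: 6. The Python sketch below illustrates this with the first ten residues of each sequence; the helper name `substitute_residue` is hypothetical, and residue numbering is assumed to start at the first residue of the mature polypeptide, not counting the initiator methionine encoded in SEQ ID NO: 3.

```python
def substitute_residue(protein: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position after confirming the expected wild-type residue."""
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError("position is outside the sequence")
    if protein[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# First ten residues of the wild-type mature Tgase (SEQ ID NO: 7).
wild_type_prefix = "DSDDRVTPPA"

# Serine-to-proline substitution at the second residue.
variant_prefix = substitute_residue(wild_type_prefix, 2, "S", "P")
```

Checking the expected wild-type residue before substituting guards against off-by-one numbering errors, which are easy to introduce when a construct carries an added start codon or tag.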

The linear product was re-circularized using T4 polynucleotide kinase and T4 DNA ligase (following manufacturer's protocols, New England Biolabs, Inc.) and residual plasmid was digested by DpnI. A second set of PCR primers flanking the promoter and terminator regions on the plasmid was used to amplify the mutant Tgase gene.

The PCR-amplified mutant Tgase gene was concurrently expressed with “pro-sequence plasmid” (described in Example 2) using the CFPS method described in Example 1. The activity of the cell free expressed mature Tgase was determined using the hydroxamate assay.

Example 7. Cytotoxic Action of Mature Tgase

Mature Tgase was expressed and purified as previously described (Javitt et al. BMC Biotech. 2017, 17, 23) and its cytotoxic activity was assessed using a commercially available cell viability assay kit.

Yeast or bacterial starter cultures were grown at 30° C.-37° C. overnight. The following day, the cell density of the saturated cultures was calculated using OD₆₀₀ and cultures were diluted to 10⁵-10⁸ cells per mL. Cultures (100 μL) were made from the dilute starters in 96 well plates. Mutant or wild-type Tgase was added to each culture at 0.01-1 weight percent. The cultures were grown overnight at 30° C.-37° C. and growth curves were measured by a BioTek Synergy Plate Reader. The following day, a cell viability assay such as BacTiter Glo™ (Promega, following manufacturer's protocols) was used to assess cell survival rate in cultures exposed to the mature Tgase variant (SEQ ID NO: 6), wild-type Tgase (SEQ ID NO: 7), or no Tgase (FIGS. 5A-C). Cell viability is indicated by detection of luminescence. Where no bar is visible, no viable cells were detected.

Although the foregoing invention has been described in some detail by way of illustration and examples for purposes of clarity of understanding, it will be apparent to those skilled in the art that certain changes and modifications may be practiced without departing from the spirit and scope of the invention, which is delineated in the appended claims. Therefore, the description should not be construed as limiting the scope of the invention.

All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entireties for all purposes and to the same extent as if each individual publication, patent, or patent application were specifically and individually indicated to be so incorporated by reference.

Nucleotide and Amino Acid Sequences

SEQ ID NO: 1 Amino acid sequence of wild-type Tgase as a Zymogen (pre-pro-Tgase; GenBank AF531437.1)
MRIRRRALVFATMSAVLCTAGFMPSAGEAAADNGAGEETKSYAETYRLTADDVA NINALNESAPAASSAGPSFRAPDSDDRVTPPAEPLDRMPDPYRPSYGRAETVVNNYI RKWQQVYSHRDGRKQQMTEEQREWLSYGCVGVTWVNSGQYPTNRLAFASFDED RFKNELKNGRPRSGETRAEFEGRVAKESFDEEKGFQRAREVASVMNRALENAHDE SAYLDNLKKELANGNDALRNEDARSPFYSALRNTPSFKERNGGNHDPSRMKAVIY SKHFWSGQDRSSSADKRKYGDPDAFRPAPGTGLVDMSRDRNIPRSPTSPGEGFVNF DYGWFGAQTEADADKTVWTHGNHYHAPNGSLGAMHVYESKFRNWSEGYSDFDR GAYVITFIPKSWNTAPDKVKQGWP

SEQ ID NO: 2 DNA Sequence of the bicistronic gene encoding pro-sequence and mature Tgase as discrete polypeptides
tctagaaataattttgtttaactttaagaaggagatataccatggacaatggtgctggcgaagaaaccaaatcctatgccgaaacctac cgtctgacggccgatgacgtcgcaaacattaatgcgctgaatgaatccgccccggccgccagctctgcgggtccgtcatttcgtgca ccgtaaggttgacagtttccgagccgcaacaatttcgtgtaactgtgagaaggcgatattatggacagcgatgatcgcgtgaccccg ccggccgaaccgctggatcgtatgccggacccgtatcgtccgtcttacggccgcgccgaaacggtggttaacaactacatccgtaa atggcagcaagtgtacagtcatcgtgatggtcgcaaacagcaaatgaccgaagaacagcgcgaatggctgtcgtatggctgcgtc ggtgtgacctgggttaacagcggccaatacccgacgaatcgtctggcctttgcatctttcgatgaagaccgctttaaaaacgaactga aaaatggccgtccgcgctcgggtgaaacgcgtgctgaatttgaaggccgcgtggcgaaagaatcttttgatgaagaaaaaggtttcc agcgtgcgcgcgaagttgcctccgtcatgaaccgtgcactggaaaatgctcacgatgaatcagcgtatctggacaatctgaagaaa gaactggcgaacggtaatgatgctctgcgtaacgaagacgcccgctctccgttttacagtgcactgcgtaataccccgtctttcaaag aacgcaacggcggtaatcatgatccgagtcgcatgaaagcagttatctactcgaaacacttctggagcggccaggatcgtagttcct cagcggacaaacgcaaatacggtgatccggatgccttccgtccggcaccgggcaccggtctggtcgatatgtcacgtgaccgtaa cattccgcgctcgccgacgagcccgggtgaaggttttgtgaatttcgattatggctggtttggtgcccagaccgaagctgatgcgga caaaaccgtttggacgcatggcaaccattatcacgctccgaatggctctctgggtgcaatgcacgtctacgaaagtaaatttcgtaact ggtccgaaggctattcagattttgaccgcggtgcgtacgttattacgttcatcccgaaaagttggaataccgcaccggacaaagtcaa acagggttggccgtaactcgag

SEQ ID NO: 3 DNA Sequence of the mature Tgase gene
atggacagcgatgatcgcgtgaccccgccggccgaaccgctggatcgtatgccggacccgtatcgtccgtcttacggccgcgccg aaacggtggttaacaactacatccgtaaatggcagcaagtgtacagtcatcgtgatggtcgcaaacagcaaatgaccgaagaacag cgcgaatggctgtcgtatggctgcgtcggtgtgacctgggttaacagcggccaatacccgacgaatcgtctggcctttgcatctttcg atgaagaccgctttaaaaacgaactgaaaaatggccgtccgcgctcgggtgaaacgcgtgctgaatttgaaggccgcgtggcgaa agaatcttttgatgaagaaaaaggtttccagcgtgcgcgcgaagttgcctccgtcatgaaccgtgcactggaaaatgctcacgatga atcagcgtatctggacaatctgaagaaagaactggcgaacggtaatgatgctctgcgtaacgaagacgcccgctctccgttttacagt gcactgcgtaataccccgtctttcaaagaacgcaacggcggtaatcatgatccgagtcgcatgaaagcagttatctactcgaaacact tctggagcggccaggatcgtagttcctcagcggacaaacgcaaatacggtgatccggatgccttccgtccggcaccgggcaccgg tctggtcgatatgtcacgtgaccgtaacattccgcgctcgccgacgagcccgggtgaaggttttgtgaatttcgattatggctggtttg gtgcccagaccgaagctgatgcggacaaaaccgtttggacgcatggcaaccattatcacgctccgaatggctctctgggtgcaatg cacgtctacgaaagtaaatttcgtaactggtccgaaggctattcagattttgaccgcggtgcgtacgttattacgttcatcccgaaaagt tggaataccgcaccggacaaagtcaaacagggttggccgtgatga

SEQ ID NO: 4 DNA Sequence of the 45 amino acid Tgase pro-sequence
atggacaatggtgctggcgaagaaaccaaatcctatgccgaaacctaccgtctgacggccgatgacgtcgcaaacattaatgcgct gaatgaatccgccccggccgccagctctgcgggtccgtcatttcgtgcaccgtaa

SEQ ID NO: 5 Peptide Sequence of the 45 amino acid Tgase pro-sequence
DNGAGEETKSYAETYRLTADDVANINALNESAPAASSAGPSFRAP

SEQ ID NO: 6 Mature Streptomyces mobaraensis Tgase Variant
DPDDRVTPPAEPLDRMPDPYRPSYGRAETVVNNYIRKWQQVYSHRDGRKQQMTE EQREWLSYGCVGVTWVNSGQYPTNRLAFASFDEDRFKNELKNGRPRSGETRAEFE GRVAKESFDEEKGFQRAREVASVMNRALENAHDESAYLDNLKKELANGNDALRN EDARSPFYSALRNTPSFKERNGGNHDPSRMKAVIYSKHFWSGQDRSSSADKRKYG DPDAFRPAPGTGLVDMSRDRNIPRSPTSPGEGFVNFDYGWFGAQTEADADKTVWT HGNHYHAPNGSLGAMHVYESKFRNWSEGYSDFDRGAYVITFIPKSWNTAPDKVK QGWP

SEQ ID NO: 7 Mature Streptomyces mobaraensis Tgase Wild Type
DSDDRVTPPAEPLDRMPDPYRPSYGRAETVVNNYIRKWQQVYSHRDGRKQQMTE EQREWLSYGCVGVTWVNSGQYPTNRLAFASFDEDRFKNELKNGRPRSGETRAEFE GRVAKESFDEEKGFQRAREVASVMNRALENAHDESAYLDNLKKELANGNDALRN EDARSPFYSALRNTPSFKERNGGNHDPSRMKAVIYSKHFWSGQDRSSSADKRKYG DPDAFRPAPGTGLVDMSRDRNIPRSPTSPGEGFVNFDYGWFGAQTEADADKTVWT HGNHYHAPNGSLGAMHVYESKFRNWSEGYSDFDRGAYVITFIPKSWNTAPDKVK QGWP

We claim:
 1. A method for expressing a mature, active form of a zymogen or a proprotein, said method comprising: expressing a DNA template that comprises a DNA sequence that encodes a mature polypeptide sequence of a zymogen or proprotein in a cell free expression system that is capable of in vitro transcription from the DNA template and in vitro translation, wherein expression in the cell free expression system produces the mature enzyme or protein polypeptide, and wherein the method does not comprise post-translational processing or cleavage of a pro-sequence from the zymogen or proprotein to produce the mature polypeptide.
 2. The method according to claim 1, wherein the cell free expression system further comprises: (a) an energy mix that comprises one or more of: polysaccharides, rNTPs, tRNA, CoA, NAD, cAMP, folinic acid, spermidine, and 3-PGA; and (b) amino acids.
 3. The method according to claim 1, wherein the cell free expression system comprises a cell free extract of eukaryotic or prokaryotic cells.
 4. The method of claim 3, wherein the cell free expression system comprises a cell free extract of bacterial cells.
 5. The method according to claim 4, wherein the bacterial cells comprise E. coli cells.
 6. The method according to claim 1, wherein the mature polypeptide is expressed as a discrete polypeptide without a pro-sequence.
 7. The method according to claim 1, wherein the mature polypeptide is expressed in the presence of a polypeptide pro-sequence for the zymogen or proprotein.
 8. The method according to claim 7, wherein a DNA sequence that encodes the pro-sequence and the DNA sequence that encodes the mature polypeptide sequence of the zymogen or proprotein are expressed as discrete polypeptide sequences from the same DNA template.
 9. The method according to claim 7, wherein the DNA sequence that encodes the mature polypeptide is expressed from a first DNA template, and the DNA sequence that encodes the pro-sequence is expressed from a separate second DNA template.
 10. The method according to claim 7, wherein the pro-sequence is synthesized chemically and added to the cell free expression system prior to, during, or after expression of the mature polypeptide.
 11. The method according to any of claims 1-10, comprising expressing a mature transglutaminase enzyme (EC 2.3.2.13) in an active, mature form.
 12. The method according to claim 11, wherein the transglutaminase enzyme is the transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof.
 13. The method according to claim 1, wherein the DNA template that encodes the mature polypeptide sequence of a zymogen or proprotein is an expression vector or a linear DNA fragment.
 14. The method according to claim 8, wherein the DNA template that encodes the mature polypeptide sequence and the pro-sequence polypeptide of a zymogen or proprotein is an expression vector or a linear DNA fragment.
 15. The method according to claim 9, wherein the first DNA template and the second DNA template are separate expression vectors or linear DNA fragments.
 16. The method according to any of claims 13-15, wherein the expression vector is a plasmid derived from pBR322, a pUC vector, or a pET vector.
 17. The method according to any of claims 13-15, wherein the linear DNA fragment is produced by de novo synthesis or by amplification of a template nucleic acid sequence that encodes the mature polypeptide.
 18. The method of claim 17, wherein the amplification comprises polymerase chain reaction (PCR).
 19. The method of claim 17, wherein production of the linear DNA fragment comprises a purification step.
 20. The method of claim 19, wherein purification of the linear DNA fragment is performed in the presence of a nuclease inhibitor.
 21. The method of claim 20, wherein the nuclease inhibitor comprises GamS.
 22. The method of claim 1, wherein the cell free expression system further comprises a reversible inhibitor of an activity of the mature zymogen or proprotein.
 23. The method of claim 1, wherein the mature polypeptide sequence of the zymogen or proprotein comprises one or more mutation in comparison to the wild type sequence of the zymogen or proprotein.
 24. The method according to claim 23, wherein the method comprises: introducing one or more mutation into the DNA sequence that encodes the mature polypeptide sequence of a zymogen or proprotein, wherein the one or more mutation results in a variant mature polypeptide sequence; and expressing the DNA template in the cell free expression system, thereby producing the variant mature polypeptide.
 25. A mature, active polypeptide of a zymogen or proprotein, produced according to the method of any of claims 1-24.
 26. A cell free expression system for expression of a mature, active form of a zymogen or a proprotein, comprising: (a) a nucleic acid template that comprises a DNA sequence that encodes a mature polypeptide sequence of a zymogen or proprotein; and (b) a cell free extract that is capable of in vitro transcription from the DNA template and in vitro translation to produce the mature polypeptide.
 27. The cell free expression system according to claim 26, further comprising: (c) an energy mix that comprises one or more of: polysaccharides, rNTPs, tRNA, CoA, NAD, cAMP, folinic acid, spermidine, and 3-PGA; and (d) amino acids.
 28. The cell free expression system according to claim 26, wherein the cell free extract comprises an extract of E. coli cells.
 29. The cell free expression system according to claim 26, wherein the DNA template encodes a mature sequence of a transglutaminase enzyme (EC 2.3.2.13).
 30. The cell free expression system of claim 29, wherein the transglutaminase enzyme is the transglutaminase from Streptomyces mobaraensis (SEQ ID NO:7) or a variant thereof.
 31. The cell free expression system according to any of claims 26-30, further comprising a DNA template that comprises a DNA sequence that encodes a pro-sequence for the zymogen or proprotein.
 32. The cell free expression system of claim 31, wherein the DNA sequence that encodes the mature polypeptide and the DNA sequence that encodes the pro-sequence of a zymogen or proprotein are encoded as discrete sequences on the same DNA template.
 33. The cell free expression system of claim 31, wherein the DNA sequence that encodes the mature polypeptide is encoded on a first DNA template and the DNA sequence that encodes the pro-sequence for the zymogen or proprotein is encoded on a separate second DNA template.
 34. The cell free expression system according to any of claims 26-30, further comprising a chemically synthesized polypeptide pro-sequence for the zymogen or proprotein, which is added to the cell free expression system prior to, during, or after production of the mature polypeptide.
 35. The cell free expression system according to any of claims 26-30, further comprising a reversible inhibitor of an activity of the mature enzyme or protein.
 36. The cell free expression system of any of claims 26-30, wherein the mature polypeptide sequence of the zymogen or proprotein is a variant that comprises one or more mutation in comparison to the wild type sequence of the zymogen or proprotein.
 37. An expression vector or linear DNA fragment that comprises a DNA sequence that encodes a mature polypeptide sequence of a zymogen or proprotein and a DNA sequence that encodes a pro-sequence for the zymogen or proprotein, encoded as discrete sequences and separated by a stop codon, a spacer region, a ribosome binding site, and a second start codon.
 38. A composition comprising: (a) a first expression vector or linear DNA fragment that comprises a DNA sequence that encodes a mature polypeptide for a zymogen or proprotein; and (b) a second expression vector or linear DNA fragment that comprises a DNA sequence that encodes a pro-sequence for the zymogen or proprotein.
 39. A method for high-throughput engineering of variants of a zymogen or proprotein, comprising: (i) introducing one or more mutation into a DNA sequence that encodes a mature polypeptide of a zymogen or proprotein, wherein the one or more mutation results in a change in one or more amino acid in the polypeptide sequence, thereby producing a DNA sequence that encodes a variant of the mature polypeptide; and (ii) expressing the DNA sequence in a cell free expression system according to any of claims 23-27, thereby producing a variant mature polypeptide.
 40. The method according to claim 39, wherein the variant DNA sequence is produced by site-directed mutagenesis or de novo DNA synthesis.
 41. The method according to claim 39, comprising producing a plurality of different DNA sequences that encode different variants of the mature polypeptide for a zymogen or proprotein, and expressing each of the different DNA sequences in individual cell free expression systems, thereby producing a plurality of different variant mature polypeptides.
 42. The method according to claim 41, further comprising assessing one or more property of the variant mature polypeptides, in comparison to the mature form of the zymogen or proprotein from which the variants are derived.
 43. The method according to claim 42, wherein the property comprises enzymatic activity and/or stability. 